
Implement OpenAI Responses API instrumentation and examples#4166

Open
vasantteja wants to merge 16 commits into open-telemetry:main from vasantteja:feat/instrument-openai-responses

Conversation

@vasantteja
Contributor

@vasantteja vasantteja commented Feb 5, 2026

Description

This PR adds OpenAI Responses API instrumentation (sync Responses.create, Responses.stream and Responses.retrieve) to opentelemetry-instrumentation-openai-v2, including streaming span lifecycle handling, token metrics, and request/response attributes. It also adds tests for Responses API behavior and updates docs/examples to show Responses usage.

  • Motivation: the Responses API is now a primary OpenAI interface; instrumentation should cover it just like chat completions and embeddings.

  • Dependencies: OpenAI SDK with Responses support (e.g., openai>=1.66.0) and jiter for streaming.

  • Upcoming: instrumentation for the async Responses methods.

  • Callouts:

  1. The Responses.parse function is left out of instrumentation.
  2. The following attributes are available in the Responses API. Should we add them?
  - gen_ai.conversation.id
  - gen_ai.output.type
  - gen_ai.usage.cache_creation.input_tokens
  - gen_ai.usage.cache_read.input_tokens
  - gen_ai.input.messages
  - gen_ai.output.messages
  - gen_ai.system_instructions
  - gen_ai.tool.definitions
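As a rough sketch of what capturing some of these might look like, the helper below maps a Responses-style payload onto the candidate attribute names listed above. The payload shape, field names, and helper name are illustrative assumptions, not code from this PR:

```python
# Hypothetical sketch: map a Responses-style payload onto a few of the
# candidate gen_ai.* attribute names. Payload shape is an assumption.
def candidate_attributes(response: dict) -> dict:
    attrs = {}
    if "conversation_id" in response:
        attrs["gen_ai.conversation.id"] = response["conversation_id"]
    # Cached-token counts, if present, would feed the cache_read attribute.
    details = response.get("usage", {}).get("input_tokens_details", {})
    if "cached_tokens" in details:
        attrs["gen_ai.usage.cache_read.input_tokens"] = details["cached_tokens"]
    # The type of the first output item (e.g. "message") as the output type.
    if response.get("output"):
        attrs["gen_ai.output.type"] = response["output"][0].get("type")
    return attrs

example = {
    "conversation_id": "conv_123",
    "usage": {"input_tokens_details": {"cached_tokens": 8}},
    "output": [{"type": "message"}],
}
print(candidate_attributes(example))
```

Whether these belong on the span at all is exactly the open question in this callout; the sketch only shows where the values would come from.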

Partly fixes #3436

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce, and list any relevant details of your test configuration.

  • Responses API tests with VCR recordings
source .tox/py312-test-instrumentation-openai-v2-latest/bin/activate
pytest instrumentation-genai/opentelemetry-instrumentation-openai-v2/tests/test_responses.py --vcr-record=all -v

Does This PR Require a Core Repo Change?

  • Yes. - Link to PR:
  • No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

- Added instrumentation for the OpenAI Responses API, including tracing for `Responses.create` and `Responses.stream` methods.
- Introduced example scripts demonstrating the usage of the Responses API with OpenTelemetry.
- Created a `.env` file for configuration, including API keys and OpenTelemetry settings.
- Updated README files to include instructions for running examples and configuring the environment.
- Added unit tests for the new Responses API functionality, ensuring proper tracing and metrics collection.

This update enhances the observability of OpenAI API interactions within the OpenTelemetry framework.
- Updated the OpenAIInstrumentor to conditionally wrap and unwrap the Responses API methods based on the installed OpenAI package version (>=1.66.0).
- Added version checks in the test suite to skip tests if the Responses API is not available, ensuring compatibility with earlier versions of the OpenAI library.
- Improved error handling for missing API methods to prevent runtime exceptions.
- Added pylint disable comments to suppress warnings for specific lines in the Responses API example and patch files.
- Updated the `responses_create` and `responses_stream` methods with links to relevant OpenAI documentation for better reference.
- Improved code formatting for readability by adjusting line breaks and indentation in the patch file.
- Reformatted code in the patch file to enhance readability by adjusting line breaks and indentation.
- Ensured consistent style for model retrieval and span name updates in the ResponseStreamWrapper class.
- Minor adjustments to import statements for clarity and organization.
- Updated the OpenAI package version in requirements.txt to 1.66.0 for compatibility.
- Refactored span name retrieval in the patch.py file to directly format the span name using operation and model attributes, removing the redundant _get_span_name function.
- Improved code clarity and consistency in the responses_create and responses_stream methods.
…omments

- Added a comment in the `responses_stream` method to clarify the purpose of avoiding duplicate span creation.
- Updated span name retrieval to use a default value of 'unknown' for the model attribute if not present, improving robustness.
- Refactored the `_record_metrics` function to directly access the request model from attributes, enhancing clarity and consistency.
- Added a new wrapper function `responses_retrieve` to trace the `retrieve` method of the `Responses` class, enhancing observability.
- Updated the `OpenAIInstrumentor` to include the new tracing functionality for the `retrieve` method.
- Enhanced test coverage for the `retrieve` method, including new test cases for both standard and streaming responses.
- Added new YAML cassettes to support the updated tests for the `retrieve` functionality.
- Added a TODO comment in the patch.py file to consider migrating Responses instrumentation to TelemetryHandler once content capture and streaming hooks are available.
- Included a reference link to the OpenAI responses.py file for context on the `retrieve` method.
- Introduced a new module `patch_responses.py` to handle tracing for the `Responses` class methods, including `create`, `stream`, and `retrieve`.
- Updated the `__init__.py` file to import the new responses patching functions.
- Enhanced test coverage with new YAML cassettes for various response scenarios, including standard and streaming responses.
- Removed outdated response tracing logic from `patch.py` to streamline the instrumentation process.
…strumentation

- Moved the `_record_metrics` function to the `utils.py` file for better organization and accessibility.
- Updated the `patch.py` file to import the `_record_metrics` function from `utils.py`, streamlining the code structure.
- Enhanced the `responses_retrieve` method to simplify span attribute checks and improve readability.
- Added new test cases and YAML cassettes to cover various response scenarios, including streaming and standard responses.
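The version-conditional wrapping described above (only instrument the Responses API when the installed `openai` package is >=1.66.0) can be sketched roughly as follows. The function names and the hand-rolled version parse are illustrative; real code should use `packaging.version` to handle pre-release versions:

```python
# Rough sketch of version-gated instrumentation: only wrap the Responses
# methods when the installed openai package is new enough to have them,
# avoiding AttributeError on older SDKs. Names are illustrative.
RESPONSES_MIN_VERSION = (1, 66, 0)

def _parse(version: str) -> tuple:
    # Simplified parse: numeric major.minor.patch only (no pre-releases).
    return tuple(int(part) for part in version.split(".")[:3])

def supports_responses_api(installed_version: str) -> bool:
    """Return True if the installed OpenAI SDK exposes the Responses API."""
    return _parse(installed_version) >= RESPONSES_MIN_VERSION

def wrap_if_supported(installed_version: str, wrap, target: str) -> bool:
    # Skip wrapping entirely on SDKs that predate the Responses API.
    if not supports_responses_api(installed_version):
        return False
    wrap(target)
    return True

print(supports_responses_api("1.65.9"), supports_responses_api("1.70.2"))
```

The same check would gate `_uninstrument` so unwrapping is only attempted on methods that were actually wrapped.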
@JWinermaSplunk

This is a relatively large PR; is there any way it could be split up?

@vasantteja
Contributor Author

@JWinermaSplunk Sure, I removed the async stuff to make it smaller, and I will remove stream and retrieve to shrink it further. FYI, I am rewriting this with TelemetryHandler so that we can start using shared utils. I am assuming that might make the PR a little bulky, so can I remove the examples?

…ry support

- Added `opentelemetry-util-genai` as a dependency for improved telemetry handling.
- Refactored response handling in `patch_responses.py` to utilize `TelemetryHandler` for tracing.
- Updated `responses_create` and `responses_retrieve` methods to integrate new telemetry features.
- Simplified imports and removed unused code in `__init__.py`.
- Added extensive test cases and YAML cassettes for various response scenarios, including streaming and standard responses.
- Adjusted requirements files to include the new utility package for testing.
…try features

- Updated `responses_create` and `responses_retrieve` methods to streamline content capture logic.
- Introduced helper functions for extracting input and output messages, and system instructions.
- Enhanced telemetry support by integrating content capture based on experimental mode settings.
- Added new test cases to validate content capture functionality in various scenarios, including streaming and standard responses.
- Created YAML cassettes for testing response handling and content capture behavior.
- Simplified imports and removed unused code in `__init__.py` and `patch_responses.py`.
@@ -0,0 +1,6 @@
openai~=1.66.0

opentelemetry-sdk~=1.36.0
Contributor

Because of "opentelemetry-api ~= 1.37" and "opentelemetry-instrumentation ~= 0.58b0" in pyproject.toml, this gives conflicting dependencies.

Changing to this works:
opentelemetry-sdk~=1.37.0 opentelemetry-exporter-otlp-proto-grpc~=1.37.0 opentelemetry-distro~=0.58b0

import os

from openai import OpenAI

Contributor

Should we add the following?

from opentelemetry.instrumentation.openai_v2 import OpenAIInstrumentor

# NOTE: OpenTelemetry Python Logs API is in beta
from opentelemetry import _logs, metrics, trace
from opentelemetry.exporter.otlp.proto.grpc._log_exporter import (
    OTLPLogExporter,
)
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import (
    OTLPMetricExporter,
)
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import (
    OTLPSpanExporter,
)
from opentelemetry.instrumentation.openai_v2 import OpenAIInstrumentor
from opentelemetry.sdk._logs import LoggerProvider
from opentelemetry.sdk._logs.export import BatchLogRecordProcessor
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

# configure tracing
trace.set_tracer_provider(TracerProvider())
trace.get_tracer_provider().add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter())
)

# configure logging
_logs.set_logger_provider(LoggerProvider())
_logs.get_logger_provider().add_log_record_processor(
    BatchLogRecordProcessor(OTLPLogExporter())
)

# configure metrics
metrics.set_meter_provider(
    MeterProvider(
        metric_readers=[
            PeriodicExportingMetricReader(
                OTLPMetricExporter(),
            ),
        ]
    )
)

# instrument OpenAI
OpenAIInstrumentor().instrument()

TelemetryHandler,
)

handler = TelemetryHandler(
Contributor

opentelemetry-util-genai 0.2b0 only has tracer support. In pyproject.toml you are restricting the version with "opentelemetry-util-genai >= 0.2b0, <0.3b0"; when future gen-ai utils releases come out, this will still break.

Contributor Author

Yeah, this will break if we move to a higher version. We are currently using this pattern across Vertex and other instrumentations. I will check whether we have to update the version when we push a new gen-ai util version.


def responses_create(
handler: "TelemetryHandler",
capture_content: bool,
Contributor

Is `capture_content` unused here?


def responses_retrieve(
handler: "TelemetryHandler",
capture_content: bool,
Contributor

Unused? Since `_should_capture_content` is used instead, do we still need `is_content_enabled()`?

streaming = is_streaming(kwargs)

capture_content = _should_capture_content()
invocation = handler.start_llm(
Contributor

Why are we calling `start_llm`, which is for LLM calls? Is `retrieve` calling an LLM to fetch the response?
Also, gen-ai utils' `handler.start_llm` will set the operation name to `chat`; it looks like you are overriding it to `retrieval` here? Also see Retrieval, which has separate attributes from inference.
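For context, the span-name logic touched earlier in this PR formats the name from the operation and model attributes, with "unknown" as the model fallback. A minimal sketch of that behavior (the function name and attribute-dict interface are illustrative, not the PR's actual code):

```python
# Minimal sketch of the span-name formatting discussed in this PR:
# "{operation} {model}", falling back to "unknown" when the request
# model attribute is absent. Illustrative only.
def span_name(attributes: dict) -> str:
    operation = attributes.get("gen_ai.operation.name", "chat")
    model = attributes.get("gen_ai.request.model", "unknown")
    return f"{operation} {model}"

print(span_name({"gen_ai.operation.name": "chat",
                 "gen_ai.request.model": "gpt-4o-mini"}))
print(span_name({}))
```

This is where the reviewer's concern bites: if `retrieve` reuses the chat-oriented helper, the operation half of the name (and the `gen_ai.operation.name` attribute) will say `chat` unless explicitly overridden.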



Development

Successfully merging this pull request may close these issues.

Instrument OpenAI Responses API

8 participants