
Fix broken Otel propagation in dspy ParallelExecutor #8505


Open · wants to merge 2 commits into main
Conversation

TomeHirata (Collaborator)

Currently, the tree structure of MLflow tracing gets broken when dspy.Parallel is used. This is because DSPy doesn't copy the parent context when workers are executed in parallel; not copying the entire contextvars is intentional, so that different threads keep separate contextvars. This PR manually propagates the OpenTelemetry context to child threads when ParallelExecutor runs, which fixes the tracing issue.

import dspy

class ParallelExample(dspy.Module):
    def __init__(self):
        super().__init__()
        self.keyword_extractor = dspy.Predict("text -> keywords")
        self.sentiment_extractor = dspy.Predict("text -> sentiment")
        self.parallel = dspy.Parallel(num_threads=2)

    def forward(self, text):
        input_example = dspy.Example(text=text).with_inputs("text")

        results = self.parallel([
            (self.keyword_extractor, input_example),
            (self.sentiment_extractor, input_example)
        ])

        return results

# Usage (assumes an LM has already been configured, e.g. dspy.configure(lm=dspy.LM(...)))
processor = ParallelExample()
result = processor(text="I love using DSPy! It makes AI programming so much easier.")

print(result)
[Screenshot from the PR description: MLflow trace tree with two predict spans under ParallelExample.forward]
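
The fix itself isn't shown in the snippet above. Below is a minimal sketch of the approach the PR describes, a _with_otel_context decorator that captures the caller's OpenTelemetry context and re-attaches it inside each worker thread; the actual code in dspy/utils/parallelizer.py may differ, so treat this as illustrative only.

# Sketch only: capture the OpenTelemetry context on the thread that spawns the
# workers, then re-attach it inside each worker thread. Names and structure are
# illustrative, not the exact code merged in this PR.
import functools

def _with_otel_context(func):
    try:
        from opentelemetry import context as otel_context
    except ImportError:
        return func  # OpenTelemetry not installed; run the worker unchanged

    # Captured on the parent (submitting) thread.
    parent_ctx = otel_context.get_current()

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Attach the parent's context so spans created in this worker thread
        # become children of the caller's active span.
        token = otel_context.attach(parent_ctx)
        try:
            return func(*args, **kwargs)
        finally:
            otel_context.detach(token)

    return wrapper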

Copilot AI (Contributor) left a comment


Pull Request Overview

This PR ensures OpenTelemetry context is propagated to threads in ParallelExecutor, fixes MLflow trace hierarchy, and adds a test to verify the behavior.

  • Introduce _with_otel_context decorator and apply it to the worker function in ParallelExecutor
  • Add test_otel_context_propagation to cover OpenTelemetry context forwarding
  • Update pyproject.toml to include OpenTelemetry dependencies

Reviewed Changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 1 comment.

  • tests/utils/test_parallelizer.py — Added OpenTelemetry context propagation test
  • pyproject.toml — Added opentelemetry-api and opentelemetry-sdk
  • dspy/utils/parallelizer.py — Implemented _with_otel_context and applied the decorator to the worker

Comments suppressed due to low confidence (1)

tests/utils/test_parallelizer.py:64

  • The test references ParallelExecutor but does not import it, causing a NameError. Add: from dspy.utils.parallelizer import ParallelExecutor.
@pytest.mark.skipif(not importlib.util.find_spec("opentelemetry"), reason="OpenTelemetry not installed")
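
For reference, a hedged sketch of what such a propagation test could look like. The actual test in this PR may differ, and the ParallelExecutor constructor arguments and execute signature used here are assumptions rather than confirmed API.

import importlib.util
import pytest

@pytest.mark.skipif(not importlib.util.find_spec("opentelemetry"), reason="OpenTelemetry not installed")
def test_otel_context_propagation():
    from opentelemetry import baggage, context as otel_context
    from dspy.utils.parallelizer import ParallelExecutor

    # Put a value into the OpenTelemetry context on the main thread...
    token = otel_context.attach(baggage.set_baggage("request_id", "abc"))
    try:
        executor = ParallelExecutor(num_threads=2)  # assumed signature
        # ...and check that worker threads still see it after propagation.
        results = executor.execute(lambda _: baggage.get_baggage("request_id"), [1, 2, 3])
        assert results == ["abc", "abc", "abc"]
    finally:
        otel_context.detach(token)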

chenmoneygithub (Collaborator) left a comment


@TomeHirata I am a bit unsure about this one. DSPy, as an authoring framework, may not want to be aware of the downstream tracing library. In other words, I'd rather avoid having OpenTelemetry code inside DSPy critical paths. As an analogy, PyTorch/TensorFlow development doesn't stay aware of Lightning/Keras despite the close integration. If we want to maintain the hierarchy of the tracing spans, maybe we should patch it on the mlflow side? The implementation is a bit tricky though.

Btw, it also feels a bit strange to display all modules' traces under the same Parallel span: normally the order of sibling spans reflects time order, but these child spans run concurrently here.

TomeHirata (Collaborator, Author) commented Jul 10, 2025

@chenmoneygithub

DSPy, as an authoring framework, may not want to be aware of the downstream tracing library. In other words, I'd rather avoid having OpenTelemetry code inside DSPy critical paths.

I completely share the same feeling, and I first looked into addressing this on the callback side. However, copying the OpenTelemetry context (or contextvars) needs to happen when child threads are spawned (in ParallelExecutor._execute_parallel). Other libraries tackle this by copying the full contextvars when running work in threads (https://github.com/langchain-ai/langchain/blob/73552883c370b72b109910dcb4a85bdea786af90/libs/core/langchain_core/runnables/config.py#L166), but that would break DSPy's thread-local settings, unfortunately. Of course, I'm open to any better strategies as long as we can track the tree structure correctly. A sketch of the two strategies follows.
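
To make the tradeoff concrete, here is a hedged sketch of the two strategies; the helper names are hypothetical, and this is neither the langchain nor the DSPy implementation.

import contextvars
from concurrent.futures import ThreadPoolExecutor

# Strategy 1: copy *all* contextvars into the worker, as some libraries do.
# This preserves tracing, but the worker also inherits DSPy's thread-local settings.
def submit_with_full_context(executor: ThreadPoolExecutor, fn, *args):
    ctx = contextvars.copy_context()        # snapshot every ContextVar
    return executor.submit(ctx.run, fn, *args)

# Strategy 2: copy only the OpenTelemetry context (the approach in this PR).
# The tracing hierarchy is preserved while DSPy's per-thread settings stay isolated.
def submit_with_otel_context(executor: ThreadPoolExecutor, fn, *args):
    from opentelemetry import context as otel_context
    parent = otel_context.get_current()

    def runner(*inner_args):
        token = otel_context.attach(parent)
        try:
            return fn(*inner_args)
        finally:
            otel_context.detach(token)

    return executor.submit(runner, *args)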

Btw, it also feels a bit strange to display all modules' traces under the same Parallel span: normally the order of sibling spans reflects time order, but these child spans run concurrently here.

Parent-child relationship and absolute execution time are slightly different things. It's correct to place the module executions directly under the parallelizer call, since they are invoked directly by the top level. The absolute execution time is stored as span metadata, and the timeline view shows that they ran in parallel.
[Screenshot: MLflow timeline view showing the parallel spans overlapping in time]
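
As a small, self-contained illustration of that point (using the OpenTelemetry SDK's in-memory exporter, not DSPy or MLflow code), two sibling spans under one parent can be identified as parallel from their start/end timestamps:

import time
from concurrent.futures import ThreadPoolExecutor
from opentelemetry import context as otel_context
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from opentelemetry.sdk.trace.export.in_memory_span_exporter import InMemorySpanExporter

exporter = InMemorySpanExporter()
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(exporter))
tracer = provider.get_tracer(__name__)

def child(name, parent_ctx):
    # Same propagation idea as the PR: re-attach the parent's context in the worker.
    token = otel_context.attach(parent_ctx)
    try:
        with tracer.start_as_current_span(name):
            time.sleep(0.1)
    finally:
        otel_context.detach(token)

with tracer.start_as_current_span("parent"):
    ctx = otel_context.get_current()
    with ThreadPoolExecutor(max_workers=2) as pool:
        list(pool.map(lambda n: child(n, ctx), ["child_a", "child_b"]))

spans = {s.name: s for s in exporter.get_finished_spans()}
a, b = spans["child_a"], spans["child_b"]
# Both children share the same parent; their time ranges overlap, so they ran in parallel.
print(a.start_time < b.end_time and b.start_time < a.end_time)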

chenmoneygithub (Collaborator)

The absolute execution time is stored as span metadata, and the timeline view shows that they ran in parallel.

Yes, that part is good. I was actually talking about a different thing: in the screenshot in the PR description, there are two predict spans under ParallelExample.forward. Normally, if there are two spans under the same parent span, it means the second span starts after the first one, which is not what happens here. It's not a problem for us, because we know how dspy.ParallelExecutor works and use tracing to help us investigate, but users may come from the reverse direction: they have no idea how the dspy API functions and use tracing to help them understand. That could be a bit confusing. This isn't really blocking, but it's something to consider, since this PR is clearly a tradeoff.

I need to think a bit more about the fix, definitely a challenging one.

TomeHirata (Collaborator, Author)

Normally, if there are two spans under the same parent span, it means the second span starts after the first one, which is not what happens here.

That's true in the single-threaded case, but not always in multi-threading. With multi-threading, the invocation stack can still be stored as a tree of spans, but we need to check the timeline view to see when those spans actually executed.
