
Fix broken Otel propagation in dspy ParallelExecutor #8505


Open · wants to merge 2 commits into main
Conversation

TomeHirata (Collaborator)

Currently, the tree structure of MLflow tracing gets broken when dspy.Parallel is used. This is because DSPy doesn't copy the parent context when workers are executed in parallel; not copying the entire contextvars is intentional, so that different threads keep separate contextvars. This PR manually propagates the OpenTelemetry context to child threads when ParallelExecutor runs, which fixes the tracing issue.

import dspy

class ParallelExample(dspy.Module):
    def __init__(self):
        super().__init__()
        self.keyword_extractor = dspy.Predict("text -> keywords")
        self.sentiment_extractor = dspy.Predict("text -> sentiment")
        self.parallel = dspy.Parallel(num_threads=2)

    def forward(self, text):
        input_example = dspy.Example(text=text).with_inputs("text")

        results = self.parallel([
            (self.keyword_extractor, input_example),
            (self.sentiment_extractor, input_example)
        ])

        return results

# Usage (assumes an LM has already been configured, e.g. dspy.configure(lm=dspy.LM(...)))
processor = ParallelExample()
result = processor(text="I love using DSPy! It makes AI programming so much easier.")

print(result)
[Screenshot from the PR description: MLflow trace tree with two predict spans under ParallelExample.forward]
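
The fix itself isn't shown in the snippet above. Below is a minimal sketch of the approach the PR describes, a _with_otel_context decorator that captures the caller's OpenTelemetry context and re-attaches it inside each worker thread; the actual code in dspy/utils/parallelizer.py may differ, so treat this as illustrative only.

# Sketch only: capture the OpenTelemetry context on the thread that spawns the
# workers, then re-attach it inside each worker thread. Names and structure are
# illustrative, not the exact code merged in this PR.
import functools

def _with_otel_context(func):
    try:
        from opentelemetry import context as otel_context
    except ImportError:
        return func  # OpenTelemetry not installed; run the worker unchanged

    # Captured on the parent (submitting) thread.
    parent_ctx = otel_context.get_current()

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Attach the parent's context so spans created in this worker thread
        # become children of the caller's active span.
        token = otel_context.attach(parent_ctx)
        try:
            return func(*args, **kwargs)
        finally:
            otel_context.detach(token)

    return wrapper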

Copilot AI (Contributor) left a comment


Pull Request Overview

This PR ensures OpenTelemetry context is propagated to threads in ParallelExecutor, fixes MLflow trace hierarchy, and adds a test to verify the behavior.

  • Introduce _with_otel_context decorator and apply it to the worker function in ParallelExecutor
  • Add test_otel_context_propagation to cover OpenTelemetry context forwarding
  • Update pyproject.toml to include OpenTelemetry dependencies

Reviewed Changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 1 comment.

  • tests/utils/test_parallelizer.py — Added OpenTelemetry context propagation test
  • pyproject.toml — Added opentelemetry-api and opentelemetry-sdk
  • dspy/utils/parallelizer.py — Implemented _with_otel_context and applied the decorator to the worker

Comments suppressed due to low confidence (1)

tests/utils/test_parallelizer.py:64

  • The test references ParallelExecutor but does not import it, causing a NameError. Add: from dspy.utils.parallelizer import ParallelExecutor.
@pytest.mark.skipif(not importlib.util.find_spec("opentelemetry"), reason="OpenTelemetry not installed")
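
For reference, a hedged sketch of what such a propagation test could look like. The actual test in this PR may differ, and the ParallelExecutor constructor arguments and execute signature used here are assumptions rather than confirmed API.

import importlib.util
import pytest

@pytest.mark.skipif(not importlib.util.find_spec("opentelemetry"), reason="OpenTelemetry not installed")
def test_otel_context_propagation():
    from opentelemetry import baggage, context as otel_context
    from dspy.utils.parallelizer import ParallelExecutor

    # Put a value into the OpenTelemetry context on the main thread...
    token = otel_context.attach(baggage.set_baggage("request_id", "abc"))
    try:
        executor = ParallelExecutor(num_threads=2)  # assumed signature
        # ...and check that worker threads still see it after propagation.
        results = executor.execute(lambda _: baggage.get_baggage("request_id"), [1, 2, 3])
        assert results == ["abc", "abc", "abc"]
    finally:
        otel_context.detach(token)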

chenmoneygithub (Collaborator) left a comment


@TomeHirata I am a bit unsure about this one. DSPy, as an authoring framework, may not want to be aware of the downstream tracing library. In other words, I'd rather avoid having OpenTelemetry code inside DSPy critical paths. As an analogy, PyTorch/TensorFlow development doesn't stay aware of Lightning/Keras despite the close integration. If we want to maintain the hierarchy of the tracing spans, maybe we should patch it on the mlflow side? The implementation is a bit tricky though.

Btw, it also feels a bit strange to display all modules' traces under the same Parallel span: normally the order of sibling spans reflects time order, but these child spans run concurrently here.

TomeHirata (Collaborator, Author) commented Jul 10, 2025

@chenmoneygithub

DSPy, as an authoring framework, may not want to be aware of the downstream tracing library. In other words, I'd rather avoid having OpenTelemetry code inside DSPy critical paths.

I completely share the same feeling, and I first looked into addressing this on the callback side. However, copying the OpenTelemetry context (or contextvars) needs to happen when child threads are spawned (in ParallelExecutor._execute_parallel). Other libraries tackle this by copying the full contextvars when running work in threads (https://github.com/langchain-ai/langchain/blob/73552883c370b72b109910dcb4a85bdea786af90/libs/core/langchain_core/runnables/config.py#L166), but that would break DSPy's thread-local settings, unfortunately. Of course, I'm open to any better strategies as long as we can track the tree structure correctly. A sketch of the two strategies follows.
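
To make the tradeoff concrete, here is a hedged sketch of the two strategies; the helper names are hypothetical, and this is neither the langchain nor the DSPy implementation.

import contextvars
from concurrent.futures import ThreadPoolExecutor

# Strategy 1: copy *all* contextvars into the worker, as some libraries do.
# This preserves tracing, but the worker also inherits DSPy's thread-local settings.
def submit_with_full_context(executor: ThreadPoolExecutor, fn, *args):
    ctx = contextvars.copy_context()        # snapshot every ContextVar
    return executor.submit(ctx.run, fn, *args)

# Strategy 2: copy only the OpenTelemetry context (the approach in this PR).
# The tracing hierarchy is preserved while DSPy's per-thread settings stay isolated.
def submit_with_otel_context(executor: ThreadPoolExecutor, fn, *args):
    from opentelemetry import context as otel_context
    parent = otel_context.get_current()

    def runner(*inner_args):
        token = otel_context.attach(parent)
        try:
            return fn(*inner_args)
        finally:
            otel_context.detach(token)

    return executor.submit(runner, *args)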

Btw, it also feels a bit strange to display all modules' traces under the same Parallel span: normally the order of sibling spans reflects time order, but these child spans run concurrently here.

Parent-child relationship and absolute execution time are slightly different things. It's correct to place the module executions directly under the parallelizer call, since they are invoked directly by the top level. The absolute execution time is stored as span metadata, and the timeline view shows that they ran in parallel.
[Screenshot: MLflow timeline view showing the parallel spans overlapping in time]
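
As a small, self-contained illustration of that point (using the OpenTelemetry SDK's in-memory exporter, not DSPy or MLflow code), two sibling spans under one parent can be identified as parallel from their start/end timestamps:

import time
from concurrent.futures import ThreadPoolExecutor
from opentelemetry import context as otel_context
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from opentelemetry.sdk.trace.export.in_memory_span_exporter import InMemorySpanExporter

exporter = InMemorySpanExporter()
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(exporter))
tracer = provider.get_tracer(__name__)

def child(name, parent_ctx):
    # Same propagation idea as the PR: re-attach the parent's context in the worker.
    token = otel_context.attach(parent_ctx)
    try:
        with tracer.start_as_current_span(name):
            time.sleep(0.1)
    finally:
        otel_context.detach(token)

with tracer.start_as_current_span("parent"):
    ctx = otel_context.get_current()
    with ThreadPoolExecutor(max_workers=2) as pool:
        list(pool.map(lambda n: child(n, ctx), ["child_a", "child_b"]))

spans = {s.name: s for s in exporter.get_finished_spans()}
a, b = spans["child_a"], spans["child_b"]
# Both children share the same parent; their time ranges overlap, so they ran in parallel.
print(a.start_time < b.end_time and b.start_time < a.end_time)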

chenmoneygithub (Collaborator)

The absolute execution time is stored as span metadata, and the timeline view shows that they ran in parallel.

Yes, that part is good. I was actually talking about a different thing: in the screenshot in the PR description, there are two predict spans under ParallelExample.forward. Normally, if there are two spans under the same parent span, it means the second span starts after the first one, which is not what happens here. It's not a problem for us, because we know how dspy.ParallelExecutor works and use tracing to help us investigate, but users may come from the reverse direction: they have no idea how the dspy API functions and use tracing to help them understand. That could be a bit confusing. This isn't really blocking, but it's something to consider, since this PR is clearly a tradeoff.

I need to think a bit more about the fix, definitely a challenging one.

TomeHirata (Collaborator, Author)

Normally, if there are two spans under the same parent span, it means the second span starts after the first one, which is not what happens here.

That's true in the single-threaded case, but not always in multi-threading. With multi-threading, the invocation stack can still be stored as a tree of spans, but we need to check the timeline view to see when those spans actually executed.
