Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG]: Unable to use asyncpg in new library version #11852

Open
Apakottur opened this issue Jan 3, 2025 · 4 comments
Open

[BUG]: Unable to use asyncpg in new library version #11852

Apakottur opened this issue Jan 3, 2025 · 4 comments
Assignees
Labels

Comments

@Apakottur
Copy link

Tracer Version(s)

2.18.1

Python Version(s)

3.12.8

Pip Version(s)

pip 24.3.1

Bug Report

We are tracing our application which communicates with postgres via the asyncpg driver.
Everything works perfectly in ddtrace==2.17.3 which we are currently using, but upgrading to 2.18.1 results in our asyncpg driver constantly timing out, for example:

/opt/virtualenvs/api-NPeSkiFo-py3.12/lib/python3.12/site-packages/sqlalchemy/engine/default.py:621: in connect
    return self.loaded_dbapi.connect(*cargs, **cparams)
/opt/virtualenvs/api-NPeSkiFo-py3.12/lib/python3.12/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py:953: in connect
    await_only(creator_fn(*arg, **kw)),
/opt/virtualenvs/api-NPeSkiFo-py3.12/lib/python3.12/site-packages/sqlalchemy/util/_concurrency_py3k.py:132: in await_only
    return current.parent.switch(awaitable)  # type: ignore[no-any-return,attr-defined] # noqa: E501
/opt/virtualenvs/api-NPeSkiFo-py3.12/lib/python3.12/site-packages/sqlalchemy/util/_concurrency_py3k.py:196: in greenlet_spawn
    value = await result
/opt/virtualenvs/api-NPeSkiFo-py3.12/lib/python3.12/site-packages/ddtrace/contrib/trace_utils_async.py:35: in wrapper
    return await func(mod, pin, wrapped, instance, args, kwargs)
/opt/virtualenvs/api-NPeSkiFo-py3.12/lib/python3.12/site-packages/ddtrace/contrib/internal/asyncpg/patch.py:103: in _traced_connect
    conn = _TracedConnection(await func(*args, **kwargs), pin)
/opt/virtualenvs/api-NPeSkiFo-py3.12/lib/python3.12/site-packages/asyncpg/connection.py:2420: in connect
    async with compat.timeout(timeout):
/usr/local/lib/python3.12/asyncio/timeouts.py:115: in __aexit__
    raise TimeoutError from exc_val
E   TimeoutError

Reproduction Code

So far I was not able to create a small reproducible example, this happens after the process runs for a while, performing many different DB queries. Will continue trying.

Error Logs

We're getting the following debug log from the tracer when the issue occurs:

DEBUG    ddtrace._trace.tracer:tracer.py:877 span <Span(id=16630326685907300418,trace_id=137532420521718859218382865815756886327,parent_id=12504371964807139896,name=api.http.request)> closing after its parent <Span(id=12504371964807139896,trace_id=137532420521718859218382865815756886327,parent_id=None,name=http.request)>, this is an error when not using async

Libraries in Use

aiobotocore==2.16.0
aiohappyeyeballs==2.4.4
aiohttp==3.11.11
aioitertools==0.12.0
aiosignal==1.3.2
amplitude-analytics==1.1.4
annotated-types==0.7.0
anyio==4.7.0
asgi-lifespan==2.1.0
asyncpg==0.30.0
asyncpg-stubs==0.30.0
attrs==24.3.0
boto3==1.35.81
botocore==1.35.81
botocore-stubs==1.35.90
bytecode==0.16.0
CacheControl==0.14.1
cachetools==5.5.0
certifi==2024.12.14
cffi==1.17.1
charset-normalizer==3.4.1
click==8.1.8
coverage==7.6.10
cryptography==44.0.0
ddtrace==2.18.1
Deprecated==1.2.15
dnspython==2.7.0
email_validator==2.2.0
emoji==2.14.0
envier==0.6.1
execnet==2.1.1
firebase-admin==6.6.0
freezegun==1.5.1
frozenlist==1.5.0
GeoAlchemy2==0.16.0
geographiclib==2.0
geopandas==1.0.1
geopy==2.4.1
google-api-core==2.24.0
google-api-python-client==2.156.0
google-auth==2.37.0
google-auth-httplib2==0.2.0
google-cloud-core==2.4.1
google-cloud-firestore==2.19.0
google-cloud-storage==2.19.0
google-crc32c==1.6.0
google-resumable-media==2.7.2
googleapis-common-protos==1.66.0
greenlet==3.1.1
grpcio==1.68.1
grpcio-status==1.68.1
h11==0.14.0
h3==4.1.2
httpcore==1.0.7
httplib2==0.22.0
httptools==0.6.4
httpx==0.28.1
idna==3.10
importlib_metadata==8.5.0
iniconfig==2.0.0
Jinja2==3.1.5
jmespath==1.0.1
jsonschema==4.23.0
jsonschema-path==0.3.3
jsonschema-specifications==2023.12.1
lazy-object-proxy==1.10.0
loguru==0.7.3
markdown-it-py==3.0.0
MarkupSafe==3.0.2
mdurl==0.1.2
moto==5.0.24
msgpack==1.1.0
multidict==6.1.0
mypy==1.14.0
mypy-extensions==1.0.0
nodeenv==1.9.1
numpy==2.2.1
openapi-schema-validator==0.6.2
openapi-spec-validator==0.7.1
opentelemetry-api==1.29.0
packaging==24.2
pandas==2.2.3
pandas-stubs==2.2.3.241126
pathable==0.4.3
pgpasslib==1.1.0
phonenumbers==8.13.52
pluggy==1.5.0
propcache==0.2.1
proto-plus==1.25.0
protobuf==5.29.2
pyasn1==0.6.1
pyasn1_modules==0.4.1
pycparser==2.22
pydantic==2.10.4
pydantic-settings==2.7.0
pydantic_core==2.27.2
Pygments==2.18.0
pyinstrument==5.0.0
PyJWT==2.10.1
pyogrio==0.10.0
pyparsing==3.2.1
pyproj==3.7.0
pyright==1.1.391
pytest==8.3.4
pytest-asyncio==0.25.0
pytest-cov==6.0.0
pytest-httpx==0.35.0
pytest-mock==3.14.0
pytest-sugar==1.0.0
pytest-xdist==3.6.1
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
pytz==2024.2
PyYAML==6.0.2
referencing==0.35.1
requests==2.32.3
responses==0.25.3
rfc3339-validator==0.1.4
rich==13.9.4
rpds-py==0.22.3
rsa==4.9
s3transfer==0.10.4
shapely==2.0.6
shellingham==1.5.4
shpyx==0.0.32
six==1.17.0
sniffio==1.3.1
SQLAlchemy==2.0.35
starlette==0.43.0
tenacity==9.0.0
termcolor==2.5.0
timezonefinder==6.5.7
typeguard==4.4.1
typer==0.15.1
types-aiobotocore==2.16.0
types-aiobotocore-personalize-runtime==2.16.1
types-aiobotocore-rekognition==2.16.1
types-aiobotocore-s3==2.16.1.post1
types-aiobotocore-sqs==2.16.1
types-aiobotocore-ssm==2.16.1
types-aiobotocore-stepfunctions==2.16.1
types-awscrt==0.23.6
types-python-dateutil==2.9.0.20241206
types-pytz==2024.2.0.20241221
types-shapely==2.0.0.20241221
typing_extensions==4.12.2
tzdata==2024.2
uritemplate==4.1.1
urllib3==2.3.0
uvicorn==0.34.0
uvloop==0.21.0
watchfiles==1.0.3
websockets==14.1
Werkzeug==3.1.3
wrapt==1.17.0
xmltodict==0.14.2
yarl==1.18.3
zipp==3.21.0

Operating System

Linux 3a14a4cc78d4 6.10.14-linuxkit #1 SMP PREEMPT_DYNAMIC Fri Nov 29 17:24:06 UTC 2024 x86_64 GNU/Linux

@wantsui
Copy link
Collaborator

wantsui commented Feb 10, 2025

The error message in your log occurs in this part of the code:

def _on_span_finish(self, span: Span) -> None:
active = self.current_span()
# Debug check: if the finishing span has a parent and its parent
# is not the next active span then this is an error in synchronous tracing.
if span._parent is not None and active is not span._parent:
log.debug("span %r closing after its parent %r, this is an error when not using async", span, span._parent)

It looks like the affected spans are http.request and api.http.request (which appears to be a custom span).

DEBUG ddtrace._trace.tracer:tracer.py:877 span <Span(id=16630326685907300418,trace_id=137532420521718859218382865815756886327,parent_id=12504371964807139896,name=api.http.request)> closing after its parent <Span(id=12504371964807139896,trace_id=137532420521718859218382865815756886327,parent_id=None,name=http.request)>, this is an error when not using async

It's not clear to me how these two spans would interact with asyncpg

/opt/virtualenvs/api-NPeSkiFo-py3.12/lib/python3.12/site-packages/ddtrace/contrib/internal/asyncpg/patch.py:103: in _traced_connect
conn = _TracedConnection(await func(*args, **kwargs), pin)
/opt/virtualenvs/api-NPeSkiFo-py3.12/lib/python3.12/site-packages/asyncpg/connection.py:2420: in connect
async with compat.timeout(timeout):
/usr/local/lib/python3.12/asyncio/timeouts.py:115: in aexit
raise TimeoutError from exc_val

Instead, I wonder if there's an interaction with asyncio? If so, there have been changes to it since 2.18.1: https://github.com/DataDog/dd-trace-py/releases?q=async&expanded=true .

Is there a staging environment where this issue occurs where you can try this with 2.20.0 to see if any of the asyncio changes fix it: https://github.com/DataDog/dd-trace-py/releases/tag/v2.20.0 ?

Regarding:

So far I was not able to create a small reproducible example, this happens after the process runs for a while, performing many different DB queries

I'm worried that if 2.20.0 doesn't show any specific improvements, we'll have to get the reproduction to figure out how to find the bug.

In case that happens, I do recommend a support ticket, referencing this Github issue so I can look at it from the other side.

Even if you can't create a small reproducible example, a small code repro that showcases how the app is configured, such as python tracer settings would be useful.

Other factors that I'm wondering about that may require details better shared over our support portal vs this public forum:

  • interaction of your custom spans with the pg calls
  • how long do you wait until this behavior occurs?
  • how much load does the application get when this behavior starts to occur?
  • what does the trace from this app generally look like?

@Apakottur
Copy link
Author

Apakottur commented Feb 17, 2025

Thanks for the detailed response @wantsui .

I tried version 2.20.1 on production and it seems more stable, much less errors than 2.17.3, but still not fully resolved.
I tried version 2.21.0 and it seems to be much worse than 2.20.1, getting errors almost immediately.

Here's a small reproducible example:

import asyncio
import datetime as dt

import freezegun
from ddtrace import patch_all
from sqlalchemy import text
from sqlalchemy.ext.asyncio import async_sessionmaker, create_async_engine

_HOST = "localhost"
_PORT = 5432
_USERNAME = ""
_PASSWORD = ""
_DB_NAME = ""

_DB_STRING = f"postgresql+asyncpg://{_USERNAME}:{_PASSWORD}@{_HOST}:{_PORT}/{_DB_NAME}"

async def _run():
    patch_all(asyncpg=True)

    engine = create_async_engine(_DB_STRING, isolation_level="AUTOCOMMIT")

    async_session = async_sessionmaker(engine, expire_on_commit=False)

    with freezegun.freeze_time(dt.datetime.now()):
        async with async_session() as session:
            result = await session.execute(text("select 1"))

    assert result.all() == [(1,)]
    print("Success!")

if __name__ == "__main__":
    asyncio.run(_run())

This code connects to a Postgres database and executes SELECT 1.
On ddtrace==2.17.3 the code runs without any issues and prints "Success!".
On ddtrace==2.21.0 the code gives:

Traceback (most recent call last):
  File ".../.venv/lib/python3.12/site-packages/asyncpg/connection.py", line 2421, in connect
    return await connect_utils._connect(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/asyncpg/connect_utils.py", line 1049, in _connect
    conn = await _connect_addr(
           ^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/asyncpg/connect_utils.py", line 886, in _connect_addr
    return await __connect_addr(params, True, *args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/asyncpg/connect_utils.py", line 931, in __connect_addr
    tr, pr = await connector
             ^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/asyncpg/connect_utils.py", line 802, in _create_ssl_connection
    tr, pr = await loop.create_connection(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/asyncio/base_events.py", line 1080, in create_connection
    infos = await self._ensure_resolved(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/asyncio/base_events.py", line 1456, in _ensure_resolved
    return await loop.getaddrinfo(host, port, family=family, type=type,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/asyncio/base_events.py", line 901, in getaddrinfo
    return await self.run_in_executor(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File ".../a.py", line 34, in <module>
    asyncio.run(_run())
  File "/usr/lib/python3.12/asyncio/runners.py", line 194, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File ".../a.py", line 27, in _run
    result = await session.execute(text("select 1"))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/ext/asyncio/session.py", line 463, in execute
    result = await greenlet_spawn(
             ^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 201, in greenlet_spawn
    result = context.throw(*sys.exc_info())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 2365, in execute
    return self._execute_internal(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 2241, in _execute_internal
    conn = self._connection_for_bind(bind)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 2110, in _connection_for_bind
    return trans._connection_for_bind(engine, execution_options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<string>", line 2, in _connection_for_bind
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/orm/state_changes.py", line 139, in _go
    ret_value = fn(self, *arg, **kw)
                ^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/orm/session.py", line 1189, in _connection_for_bind
    conn = bind.connect()
           ^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 3274, in connect
    return self._connection_cls(self)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 146, in __init__
    self._dbapi_connection = engine.raw_connection()
                             ^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 3298, in raw_connection
    return self.pool.connect()
           ^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/pool/base.py", line 449, in connect
    return _ConnectionFairy._checkout(self)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/pool/base.py", line 1263, in _checkout
    fairy = _ConnectionRecord.checkout(pool)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/pool/base.py", line 712, in checkout
    rec = pool._do_get()
          ^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/pool/impl.py", line 179, in _do_get
    with util.safe_reraise():
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/util/langhelpers.py", line 146, in __exit__
    raise exc_value.with_traceback(exc_tb)
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/pool/impl.py", line 177, in _do_get
    return self._create_connection()
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/pool/base.py", line 390, in _create_connection
    return _ConnectionRecord(self)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/pool/base.py", line 674, in __init__
    self.__connect()
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/pool/base.py", line 900, in __connect
    with util.safe_reraise():
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/util/langhelpers.py", line 146, in __exit__
    raise exc_value.with_traceback(exc_tb)
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/pool/base.py", line 896, in __connect
    self.dbapi_connection = connection = pool._invoke_creator(self)
                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/engine/create.py", line 646, in connect
    return dialect.connect(*cargs, **cparams)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/engine/default.py", line 622, in connect
    return self.loaded_dbapi.connect(*cargs, **cparams)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 961, in connect
    await_only(creator_fn(*arg, **kw)),
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 132, in await_only
    return current.parent.switch(awaitable)  # type: ignore[no-any-return,attr-defined] # noqa: E501
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 196, in greenlet_spawn
    value = await result
            ^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/ddtrace/contrib/internal/trace_utils_async.py", line 35, in wrapper
    return await func(mod, pin, wrapped, instance, args, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/ddtrace/contrib/internal/asyncpg/patch.py", line 103, in _traced_connect
    conn = _TracedConnection(await func(*args, **kwargs), pin)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.venv/lib/python3.12/site-packages/asyncpg/connection.py", line 2420, in connect
    async with compat.timeout(timeout):
  File "/usr/lib/python3.12/asyncio/timeouts.py", line 115, in __aexit__
    raise TimeoutError from exc_val
TimeoutError

Do you get the same result when running this code?

@quinna-h
Copy link
Contributor

Hi @Apakottur
Thanks for getting back to us with the sample reproduction!

It looks like this error starts happening at tracer version 2.18.0 and later. We think it's related to changes in the freezegun integration introduced by #11406. The PR also enabled patching freezegun by default.

While we investigate further, could you change patch_all(asyncpg=True) to patch_all(freezegun=False)? asyncpg is patched by default, so you don't need to pass it into patch_all.

@quinna-h quinna-h assigned quinna-h and unassigned wantsui Feb 18, 2025
@Apakottur
Copy link
Author

Thanks for explanation @quinna-h !

I tried patch_all(freezegun=False) and I can confirm that it fixes my sample reproduction code.

So far this also seems to work well on our production environment, I'll update if we hit further issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants