When multiple tenants simultaneously request a DIDExchange connection with an issuer's public DID, several unhandled exceptions are raised, causing all requested connections to fail.
The handling logic and auto-complete flows associated with the DIDExchange request do not report any error to the clients that made the request, leaving their connection record in the request-sent state.
The issuer does not receive any request-received records as expected - not even one of the many requests.
Note: this is running the latest ACA-Py release, with askar 0.4.3
Steps to Reproduce
Reproducing this in ACA-Py alone takes many steps, so the simplest route is to check out our acapy-cloud repo (previously aries-cloudapi-python), where a test script does all the setup and reproduces the issue for you: https://github.com/didx-xyz/acapy-cloud
In summary, aside from onboarding an issuer and registering their public DID, here is how to reproduce the issue:
Create multiple tenants (reliably fails for me with 10)
For each one, initiate a DIDExchange connection request (POST /didexchange/create-request) using use_public_did to set the issuer's public DID for the request.
Observe the unhandled exceptions raised in the system logs.
Check the state of the connection records for the tenants and the issuer.
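For readers who want the shape of the concurrent part without the full stack, here is a minimal sketch of it, not the actual test script. It assumes the HTTP call to POST /didexchange/create-request is wrapped in a coroutine that you supply; `request_all`, `tally_states` and `fake_send_request` are hypothetical names, and the fake sender only mimics the `request-sent` state described above.

```python
import asyncio
from collections import Counter
from typing import Awaitable, Callable, Iterable


async def request_all(
    tenants: Iterable[str],
    send_request: Callable[[str], Awaitable[dict]],
) -> list:
    """Start one request per tenant at the same time and collect every outcome."""
    pending = [send_request(tenant) for tenant in tenants]
    # return_exceptions=True keeps one failure from hiding the others.
    return await asyncio.gather(*pending, return_exceptions=True)


def tally_states(outcomes: list) -> Counter:
    """Count record states, grouping raised exceptions under 'error'."""
    return Counter(
        "error" if isinstance(o, BaseException) else o.get("state", "unknown")
        for o in outcomes
    )


async def fake_send_request(tenant: str) -> dict:
    # Hypothetical stand-in for POST /didexchange/create-request; it only
    # mimics the record state reported in this issue.
    await asyncio.sleep(0)
    return {"tenant": tenant, "state": "request-sent"}


if __name__ == "__main__":
    tenants = [f"tenant-{i}" for i in range(10)]
    outcomes = asyncio.run(request_all(tenants, fake_send_request))
    print(tally_states(outcomes))
```

Swapping `fake_send_request` for a real client call per tenant would give a quick tally of how many records are stuck in each state.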
The above steps are automated by a test file, which you can check out at app/tests/e2e/test_many_connections.py
Spin up the stack: mise run tilt:up, and wait for services to be up and running (visit localhost:10350)
Run the test: pytest app/tests/e2e/test_many_connections.py
Click on the Multitenant-Agent tab in the Tilt UI (localhost:10350) to view logs
The test should fail with "Connection 0 failed with exception" and then "expected webhook not received".
Under Multitenant-Agent logs, you'll see many exceptions being raised, one for each request.
The stack trace suggests the cause is a timeout while waiting to open an askar session:
2025-02-04 11:14:49,084 acapy_agent.core.dispatcher ERROR Handler error: Dispatcher.handle_v1_message
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/asyncio/tasks.py", line 520, in wait_for
    return await fut
           ^^^^^^^^^
  File "/home/aries/.local/lib/python3.12/site-packages/aries_askar/store.py", line 773, in _open
    await bindings.session_start(self._store, self._profile, self._is_txn),
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aries/.local/lib/python3.12/site-packages/aries_askar/bindings/__init__.py", line 266, in session_start
    handle = await invoke_async(
             ^^^^^^^^^^^^^^^^^^^
  File "/home/aries/.local/lib/python3.12/site-packages/aries_askar/bindings/lib.py", line 393, in invoke_async
    return await self.loaded.invoke_async(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/asyncio/futures.py", line 289, in __await__
    yield self  # This tells Task to wait for completion.
    ^^^^^^^^^^
  File "/usr/local/lib/python3.12/asyncio/tasks.py", line 385, in __wakeup
    future.result()
  File "/usr/local/lib/python3.12/asyncio/futures.py", line 197, in result
    raise self._make_cancelled_error()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/asyncio/tasks.py", line 314, in __step_run_and_handle_result
    result = coro.send(None)
             ^^^^^^^^^^^^^^^
  File "/home/aries/.local/lib/python3.12/site-packages/acapy_agent/core/dispatcher.py", line 257, in handle_v1_message
    await handler(context, responder)
  File "/home/aries/.local/lib/python3.12/site-packages/acapy_agent/protocols/didexchange/v1_0/handlers/request_handler.py", line 36, in handle
    conn_rec = await mgr.receive_request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aries/.local/lib/python3.12/site-packages/acapy_agent/protocols/didexchange/v1_0/manager.py", line 558, in receive_request
    conn_rec = await self._receive_request_public_did(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aries/.local/lib/python3.12/site-packages/acapy_agent/protocols/didexchange/v1_0/manager.py", line 704, in _receive_request_public_did
    await self._extract_and_record_did_doc_info(request)
  File "/home/aries/.local/lib/python3.12/site-packages/acapy_agent/protocols/didexchange/v1_0/manager.py", line 725, in _extract_and_record_did_doc_info
    await self.store_did_document(conn_did_doc)
  File "/home/aries/.local/lib/python3.12/site-packages/acapy_agent/connections/base_manager.py", line 380, in store_did_document
    async with self._profile.session() as session:
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aries/.local/lib/python3.12/site-packages/acapy_agent/core/profile.py", line 197, in __aenter__
    await self._setup()
  File "/home/aries/.local/lib/python3.12/site-packages/acapy_agent/askar/profile.py", line 252, in _setup
    self._handle = await asyncio.wait_for(self._opener, 10)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/asyncio/tasks.py", line 519, in wait_for
    async with timeouts.timeout(timeout):
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/asyncio/timeouts.py", line 115, in __aexit__
    raise TimeoutError from exc_val
TimeoutError
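
The chained CancelledError and TimeoutError in the trace are what `asyncio.wait_for` produces when the awaited operation does not finish in time. The sketch below reproduces only that mechanism in isolation. `slow_opener` and `describe_open` are hypothetical stand-ins, not ACA-Py or askar code, and the sketch says nothing about why the real session open is slow.

```python
import asyncio


async def slow_opener() -> str:
    # Hypothetical stand-in for a session open that takes too long.
    await asyncio.sleep(0.2)
    return "session-handle"


async def describe_open(timeout: float) -> str:
    """Report whether the stand-in opener finished within the timeout."""
    try:
        await asyncio.wait_for(slow_opener(), timeout)
    except asyncio.TimeoutError:
        # wait_for cancels the pending awaitable before raising, which is
        # why a CancelledError precedes the TimeoutError in the trace above.
        return "timed out"
    return "opened"


if __name__ == "__main__":
    print(asyncio.run(describe_open(0.01)))
    print(asyncio.run(describe_open(1.0)))
```

In the trace, the equivalent call is `asyncio.wait_for(self._opener, 10)`, so each failing handler waited the full 10 seconds before raising.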
PS: Log levels can be modified in helm/acapy-cloud/conf/local/multitenant-agent.yaml, e.g. set ACAPY_LOG_LEVEL to debug
Please let me know if the replication steps are successful or not, or whether you need help with the acapy-cloud mise setup.
ff137 added a commit to didx-xyz/acapy-cloud that referenced this issue on Feb 4, 2025