Testing for telemetry #616

Open
wants to merge 268 commits into base: main

268 commits
afcb0f0
[ES-402013] Close cursors before closing connection (#38)
Aug 23, 2022
af945aa
Bump version to 2.0.5 and improve CHANGELOG (#40)
Aug 23, 2022
441a6ae
fix dco issue
moderakh Aug 25, 2022
29fe6b4
fix dco issue
moderakh Aug 25, 2022
06d9df8
Merge pull request #42 from moderakh/fix-dco-issue
moderakh Aug 25, 2022
cf3130e
dco tunning
moderakh Aug 25, 2022
4387f93
dco tunning
moderakh Aug 25, 2022
285e516
Merge pull request #43 from moderakh/dco-tunning
moderakh Aug 25, 2022
ea0f076
Github workflows: run checks on pull requests from forks (#47)
Aug 26, 2022
616a5c8
OAuth implementation (#15)
moderakh Sep 14, 2022
e39d294
Automate deploys to Pypi (#48)
Sep 22, 2022
1ea2fe0
[PECO-205] Add functional examples (#52)
Sep 30, 2022
3638fa2
Bump version to 2.1.0 (#54)
Oct 1, 2022
1a4cf4b
[SC-110400] Enabling compression in Python SQL Connector (#49)
mohitsingla-db Oct 13, 2022
8d6d47f
Add tests for parameter sanitisation / escaping (#46)
Oct 14, 2022
3d3c692
Bump thrift dependency to 0.16.0 (#65)
Nov 8, 2022
5cbfcac
Bump version to 2.2.0 (#66)
Nov 17, 2022
c6e573c
Support Python 3.11 (#60)
Nov 28, 2022
7c53b76
Bump version to 2.2.1 (#70)
Nov 28, 2022
4f221b3
Add none check on _oauth_persistence in DatabricksOAuthProvider (#71)
jackyhu-db Dec 29, 2022
cfa38a1
Support custom oauth client id and redirect port (#75)
jackyhu-db Dec 29, 2022
2f2a761
Bump version to 2.2.2 (#76)
jackyhu-db Jan 3, 2023
def5e0e
Merge staging ingestion into main (#78)
Jan 10, 2023
3cc9393
Bump version to 2.3.0 and update changelog (#80)
Jan 10, 2023
aa55a6e
Add pkgutil-style for the package (#84)
lu-wang-dl Jan 27, 2023
ce158cb
Add SQLAlchemy Dialect (#57)
Feb 17, 2023
0ed7e53
Bump to version 2.4.0(#89)
Feb 21, 2023
9a06d6c
Fix syntax in examples in root readme. (#92)
shea-parkes Feb 27, 2023
20e789f
Less strict numpy and pyarrow dependencies (#90)
Mar 7, 2023
3a60599
Update example in docstring so query output is valid Spark SQL (#95)
Mar 21, 2023
e627649
Bump version to 2.4.1 (#96)
Mar 21, 2023
c43eaf8
Update CODEOWNERS (#97)
moderakh Mar 24, 2023
b0b6abd
Add Andre to CODEOWNERS (#98)
yunbodeng-db Mar 29, 2023
f440791
Add external auth provider + example (#101)
andrefurlan-db Apr 12, 2023
5f247e5
Retry on connection timeout (#103)
andrefurlan-db Apr 13, 2023
c1d9510
[PECO-244] Make http proxies work (#81)
Apr 14, 2023
c5731d8
Bump to version 2.5.0 (#104)
Apr 15, 2023
7087236
Fix changelog release date for version 2.5.0
Apr 15, 2023
61b6911
Relax sqlalchemy requirement (#113)
Apr 28, 2023
b5ab608
Update to version 2.5.1 (#114)
Apr 28, 2023
ad6fbd9
Fix SQLAlchemy timestamp converter + docs (#117)
May 9, 2023
73108e2
Relax pandas and alembic requirements (#119)
May 9, 2023
7d85814
Bump to version 2.5.2 (#118)
May 9, 2023
4077c7f
Use urllib3 for thrift transport + reuse http connections (#131)
Jun 7, 2023
cdf1857
Default socket timeout to 15 min (#137)
mattdeekay Jun 7, 2023
5539b26
Bump version to 2.6.0 (#139)
Jun 7, 2023
728e2b1
Fix: some thrift RPCs failed with BadStatusLine (#141)
Jun 8, 2023
eada549
Bump version to 2.6.1 (#142)
Jun 8, 2023
cdc50d2
[ES-706907] Retry GetOperationStatus for http errors (#145)
Jun 14, 2023
2904788
Bump version to 2.6.2 (#147)
Jun 14, 2023
782ebb6
[PECO-626] Support OAuth flow for Databricks Azure (#86)
jackyhu-db Jun 20, 2023
b7ada62
Use a separate logger for unsafe thrift responses (#153)
Jun 23, 2023
c6cf88f
Improve e2e test development ergonomics (#155)
Jun 23, 2023
95cf95b
Don't raise exception when closing a stale Thrift session (#159)
Jun 26, 2023
3680a0f
Bump to version 2.7.0 (#161)
Jun 26, 2023
ba2cd84
Cloud Fetch download handler (#127)
mattdeekay Jun 27, 2023
061c763
Cloud Fetch download manager (#146)
mattdeekay Jul 3, 2023
e8fc63b
Cloud fetch queue and integration (#151)
mattdeekay Jul 5, 2023
813c73c
Cloud Fetch e2e tests (#154)
mattdeekay Jul 7, 2023
d3f0513
Update changelog for cloudfetch (#172)
mattdeekay Jul 10, 2023
6786933
Improve sqlalchemy backward compatibility with 1.3.24 (#173)
Jul 11, 2023
203735f
OAuth: don't override auth headers with contents of .netrc file (#122)
Jul 12, 2023
bd08f58
Fix proxy connection pool creation (#158)
sebbegg Jul 12, 2023
9508c4f
Relax pandas dependency constraint to allow ^2.0.0 (#164)
itsdani Jul 12, 2023
8140be9
Use hex string version of operation ID instead of bytes (#170)
Jul 12, 2023
850235c
SQLAlchemy: fix has_table so it honours schema= argument (#174)
Jul 12, 2023
4c766ef
Fix socket timeout test (#144)
mattdeekay Jul 12, 2023
7fe5ddf
Disable non_native_boolean_check_constraint (#120)
bkyryliuk Jul 12, 2023
50dfd93
Remove unused import for SQLAlchemy 2 compatibility (#128)
WilliamGentry Jul 12, 2023
4b0b8bd
Bump version to 2.8.0 (#178)
Jul 21, 2023
f07df30
Fix typo in python README quick start example (#186)
dbarrundia-tiger Aug 9, 2023
683e03c
Configure autospec for mocked Client objects (#188)
Aug 9, 2023
d168598
Use urllib3 for retries (#182)
Aug 9, 2023
fcfe8f4
Bump version to 2.9.0 (#189)
Aug 10, 2023
972f7cc
Explicitly add urllib3 dependency (#191)
jacobus-herman Aug 10, 2023
1c3ce1e
Bump to 2.9.1 (#195)
Aug 11, 2023
667f719
Make backwards compatible with urllib3~=1.0 (#197)
Aug 16, 2023
ddf8a5f
Convenience improvements to v3 retry logic (#199)
Aug 17, 2023
56c7d41
Bump version to 2.9.2 (#201)
Aug 18, 2023
312c7b9
Github Actions Fix: poetry install fails for python 3.7 tests (#208)
Aug 24, 2023
9bc0d3e
Make backwards compatible with urllib3~=1.0 [Follow up #197] (#206)
Aug 24, 2023
33390db
Bump version to 2.9.3 (#209)
Aug 24, 2023
e176f65
Add note to sqlalchemy example: IDENTITY isn't supported yet (#212)
Aug 31, 2023
854c56f
[PECO-1029] Updated thrift compiler version (#216)
nithinkdb Sep 9, 2023
0d1d7d8
[PECO-1055] Updated thrift defs to allow Tsparkparameters (#220)
nithinkdb Sep 11, 2023
c32b71a
Update changelog to indicate that 2.9.1 and 2.9.2 have been yanked. (…
Sep 13, 2023
4588ff3
Fix changelog typo: _enable_v3_retries (#225)
Sep 18, 2023
b9bd2a1
Introduce SQLAlchemy reusable dialog tests (#125)
unj1m Sep 20, 2023
329b7ee
[PECO-1026] Add Parameterized Query support to Python (#217)
nithinkdb Sep 22, 2023
9489087
Parameterized queries: Add e2e tests for inference (#227)
Sep 25, 2023
b94f59e
[PECO-1109] Parameterized Query: add suport for inferring decimal typ…
Sep 26, 2023
9592098
SQLAlchemy 2: reorganise dialect files into a single directory (#231)
Sep 26, 2023
84a6cbc
[PECO-1083] Updated thrift files and added check for protocol version…
nithinkdb Sep 29, 2023
9d93e1b
[PECO-840] Port staging ingestion behaviour to new UC Volumes (#235)
Sep 30, 2023
ef5fbda
Query parameters: implement support for binding NoneType parameters (…
Sep 30, 2023
f138703
SQLAlchemy 2: Bump dependency version and update e2e tests for existi…
Oct 2, 2023
04c99e4
Revert "[PECO-1083] Updated thrift files and added check for protocol…
Oct 2, 2023
cbe21e5
SQLAlchemy 2: add type compilation for all CamelCase types (#238)
Oct 2, 2023
77a8886
SQLAlchemy 2: add type compilation for uppercase types (#240)
Oct 2, 2023
4a70379
SQLAlchemy 2: Stop skipping all type tests (#242)
Oct 10, 2023
0e791ba
[PECO-1134] v3 Retries: allow users to bound the number of redirects …
Oct 10, 2023
f198a25
Parameters: Add type inference for BIGINT and TINYINT types (#246)
Oct 11, 2023
d975611
SQLAlchemy 2: Stop skipping some non-type tests (#247)
Oct 13, 2023
a596776
SQLAlchemy 2: implement and refactor schema reflection methods (#249)
Oct 13, 2023
16a5106
Add GovCloud domain into AWS domains (#252)
jackyhu-db Oct 17, 2023
ca84f1a
SQLAlchemy 2: Refactor __init__.py into base.py (#250)
Oct 18, 2023
45c6073
SQLAlchemy 2: Finish implementing all of ComponentReflectionTest (#251)
Oct 18, 2023
3a8b4ea
SQLAlchemy 2: Finish marking all tests in the suite (#253)
Oct 18, 2023
8a0ec56
SQLAlchemy 2: Finish organising compliance test suite (#256)
Oct 23, 2023
4905952
SQLAlchemy 2: Fix failing mypy checks from development (#257)
Oct 23, 2023
7444425
Enable cloud fetch by default (#258)
Oct 25, 2023
9a8ac88
[PECO-1137] Reintroduce protocol checking to Python test fw (#248)
nithinkdb Oct 25, 2023
6bc7413
sqla2 clean-up: make sqlalchemy optional and don't mangle the user-ag…
Oct 28, 2023
95e5595
SQLAlchemy 2: Add support for TINYINT (#265)
Oct 31, 2023
c69d886
Add OAuth M2M example (#266)
jackyhu-db Oct 31, 2023
012f6ed
Native Parameters: reintroduce INLINE approach with tests (#267)
Nov 1, 2023
b09ff05
Document behaviour of executemany (#213)
martinitus Nov 1, 2023
fd4336e
SQLAlchemy 2: Expose TIMESTAMP and TIMESTAMP_NTZ types to users (#268)
Nov 1, 2023
f3081a5
Drop Python 3.7 as a supported version (#270)
Nov 1, 2023
ff51bfb
GH Workflows: remove Python 3.7 from the matrix for _all_ workflows (…
Nov 9, 2023
ca000db
Add README and updated example for SQLAlchemy usage (#273)
Nov 16, 2023
6aa7890
Rewrite native parameter implementation with docs and tests (#281)
Nov 16, 2023
bf084fe
Enable v3 retries by default (#282)
Nov 17, 2023
23b51c9
security: bump pyarrow dependency to 14.0.1 (#284)
Nov 17, 2023
5a1acdc
Bump package version to 3.0.0 (#285)
Nov 17, 2023
e768d48
Fix docstring about default parameter approach (#287)
Falydoor Nov 21, 2023
505a522
[PECO-1286] Add tests for complex types in query results (#293)
Nov 29, 2023
5c01874
sqlalchemy: fix deprecation warning for dbapi classmethod (#294)
Nov 29, 2023
2027145
[PECO-1297] sqlalchemy: fix: can't read columns for tables containing…
Nov 30, 2023
9e963a0
Prepared 3.0.1 release (#297)
Dec 1, 2023
f703d81
Make contents of `__init__.py` equal across projects (#304)
pietern Dec 26, 2023
bdd2cb6
Fix URI construction in ThriftBackend (#303)
NodeJSmith Jan 23, 2024
00b8d3e
[sqlalchemy] Add table and column comment support (#329)
Jan 25, 2024
a6e81ed
Pin pandas and urllib3 versions to fix runtime issues in dbt-databric…
benc-db Jan 25, 2024
c89da23
SQLAlchemy: TINYINT types didn't reflect properly (#315)
TimTheinAtTabs Jan 25, 2024
6482c76
[PECO-1435] Restore `tests.py` to the test suite (#331)
Jan 26, 2024
d20d931
Bump to version 3.0.2 (#335)
Jan 26, 2024
e3e0f49
Update some outdated OAuth comments (#339)
jackyhu-db Jan 30, 2024
456fec5
Redact the URL query parameters from the urllib3.connectionpool logs …
mkazia-db Feb 2, 2024
01cfc66
Bump to version 3.0.3 (#344)
jackyhu-db Feb 2, 2024
9ff99b8
[PECO-1411] Support Databricks OAuth on GCP (#338)
jackyhu-db Feb 5, 2024
072ef2c
[PECO-1414] Support Databricks native OAuth in Azure (#351)
jackyhu-db Feb 13, 2024
f52c658
Prep for Test Automation (#352)
benc-db Feb 14, 2024
b1bd792
Update code owners (#345)
yunbodeng-db Feb 14, 2024
70f3738
Reverting retry behavior on 429s/503s to how it worked in 2.9.3 (#349)
benc-db Feb 15, 2024
912127c
Bump to version 3.1.0 (#358)
jackyhu-db Feb 16, 2024
1ed5c9d
[PECO-1440] Expose current query id on cursor object (#364)
kravets-levko Mar 4, 2024
1577506
Add a default for retry after (#371)
benc-db Mar 14, 2024
e01ef74
Fix boolean literals (#357)
aholyoke Mar 14, 2024
7cfd6f6
Don't retry network requests that fail with code 403 (#373)
Mar 15, 2024
6cf12fb
Bump to 3.1.1 (#374)
benc-db Mar 19, 2024
02d08d6
Fix cookie setting (#379)
benc-db Mar 27, 2024
4122597
Fixing a couple type problems: how I would address most of #381 (#382)
wyattscarpenter Apr 2, 2024
3631e55
fix the return types of the classes' __enter__ functions (#384)
wyattscarpenter Apr 2, 2024
4b1b7ad
Add Kravets Levko to codeowners (#386)
kravets-levko Apr 15, 2024
f2d927b
Prepare for 3.1.2 (#387)
benc-db Apr 18, 2024
2d2f8f7
Update the proxy authentication (#354)
amir-haroun May 23, 2024
d9802a8
Fix failing tests (#392)
kravets-levko May 28, 2024
9c158d9
Relax `pyarrow` pin (#389)
dhirschfeld May 29, 2024
0400bdb
Fix log error in oauth.py (#269)
susodapop May 29, 2024
683a033
Enable `delta.feature.allowColumnDefaults` for all tables (#343)
dhirschfeld May 30, 2024
3a68fa8
Fix SQLAlchemy tests (#393)
kravets-levko May 30, 2024
94a2597
Add more debug logging for CloudFetch (#395)
kravets-levko Jun 6, 2024
3a50d70
Update Thrift package (#397)
m1n0 Jun 12, 2024
37d8a7b
Prepare release 3.2.0 (#396)
kravets-levko Jun 13, 2024
0017b0c
move py.typed to correct places (#403)
wyattscarpenter Jul 2, 2024
2a1875a
Upgrade mypy (#406)
wyattscarpenter Jul 3, 2024
9fd4a25
Do not retry failing requests with status code 401 (#408)
Hodnebo Jul 3, 2024
74bcc86
[PECO-1715] Remove username/password (BasicAuth) auth option (#409)
jackyhu-db Jul 4, 2024
e7c0c06
[PECO-1751] Refactor CloudFetch downloader: handle files sequentially…
kravets-levko Jul 11, 2024
677483d
Fix CloudFetch retry policy to be compatible with all `urllib3` versi…
kravets-levko Jul 11, 2024
512efca
Disable SSL verification for CloudFetch links (#414)
kravets-levko Jul 16, 2024
1a1497b
Prepare relese 3.3.0 (#415)
kravets-levko Jul 17, 2024
b751088
Fix pandas 2.2.2 support (#416)
kfollesdal Jul 26, 2024
4959197
[PECO-1801] Make OAuth as the default authenticator if no authenticat…
jackyhu-db Aug 1, 2024
7467860
[PECO-1857] Use SSL options with HTTPS connection pool (#425)
kravets-levko Aug 22, 2024
2de70ec
Prepare release v3.4.0 (#430)
kravets-levko Aug 27, 2024
2675099
[PECO-1926] Create a non pyarrow flow to handle small results for the…
jprakash-db Oct 3, 2024
c755ecc
[PECO-1961] On non-retryable error, ensure PySQL includes useful info…
shivam2680 Oct 3, 2024
1a44d91
Reformatted all the files using black (#448)
jprakash-db Oct 3, 2024
92dff6c
Prepare release v3.5.0 (#457)
jackyhu-db Oct 18, 2024
b4bcf8a
[PECO-2051] Add custom auth headers into cloud fetch request (#460)
jackyhu-db Oct 25, 2024
28a0fe6
Prepare release 3.6.0 (#461)
jackyhu-db Oct 25, 2024
82efe73
[ PECO - 1768 ] PySQL: adjust HTTP retry logic to align with Go and N…
jprakash-db Nov 20, 2024
5e11582
[ PECO-2065 ] Create the async execution flow for the PySQL Connector…
jprakash-db Nov 26, 2024
a9ae775
Fix for check_types github action failing (#472)
jprakash-db Nov 26, 2024
c251d91
Remove upper caps on dependencies (#452)
arredond Dec 5, 2024
8a63786
Updated the doc to specify native parameters in PUT operation is not …
jprakash-db Dec 6, 2024
9d6813b
Incorrect rows in inline fetch result (#479)
jprakash-db Dec 22, 2024
8468a2b
Bumped up to version 3.7.0 (#482)
jprakash-db Dec 23, 2024
aa673ac
PySQL Connector split into connector and sqlalchemy (#444)
jprakash-db Dec 27, 2024
fd7f85c
Removed CI CD for python3.8 (#490)
jprakash-db Jan 17, 2025
b20c55b
Added CI CD upto python 3.12 (#491)
jprakash-db Jan 18, 2025
d61a964
Merging changes from v3.7.1 release (#488)
jprakash-db Jan 18, 2025
efd82fb
Bumped up to version 4.0.0 (#493)
jprakash-db Jan 22, 2025
ed19388
Updated action's version (#455)
newwingbird Feb 27, 2025
f8f9f4e
Support Python 3.13 and update deps (#510)
dhirschfeld Feb 27, 2025
0e51281
Improve debugging + fix PR review template (#514)
samikshya-db Mar 2, 2025
9665a74
Forward porting all changes into 4.x.x. uptil v3.7.3 (#529)
jprakash-db Mar 7, 2025
b24ddd7
Updated the CODEOWNERS (#531)
jprakash-db Mar 7, 2025
0013ba4
Add version check for urllib3 in backoff calculation (#526)
shivam2680 Mar 11, 2025
851d23b
[ES-1372353] make user_agent_header part of public API (#530)
shivam2680 Mar 12, 2025
f321b49
Updates runner used to run DCO check to use databricks-protected-runn…
madhav-db Mar 12, 2025
b000892
Support multiple timestamp formats in non arrow flow (#533)
jprakash-db Mar 18, 2025
2553bcf
prepare release for v4.0.1 (#534)
shivam2680 Mar 19, 2025
078f41b
Relaxed bound for python-dateutil (#538)
jprakash-db Apr 1, 2025
adc2c86
Bumped up the version for 4.0.2 (#539)
jprakash-db Apr 1, 2025
f9fe172
Added example for async execute query (#537)
jprakash-db Apr 1, 2025
6f99449
Added urllib3 version check (#547)
jprakash-db Apr 21, 2025
6790dca
Bump version to 4.0.3 (#549)
jprakash-db Apr 22, 2025
3a4d6d3
Cleanup fields as they might be deprecated/removed/change in the futu…
vikrantpuppala May 9, 2025
557bb68
Refactor decimal conversion in PyArrow tables to use direct casting (…
jayantsing-db May 12, 2025
9a3f946
[PECOBLR-361] convert column table to arrow if arrow present (#551)
shivam2680 May 16, 2025
7233e4e
Update CODEOWNERS (#562)
jprakash-db May 21, 2025
b88eba0
Enhance Cursor close handling and context manager exception managemen…
madhav-db May 21, 2025
14c8a7e
PECOBLR-86 improve logging on python driver (#556)
saishreeeee May 22, 2025
8013a0d
Update github actions run conditions (#569)
jprakash-db May 26, 2025
fdd385f
Added classes required for telemetry (#572)
saishreeeee May 30, 2025
9dc7d52
E2E POC for python telemetry for connect logs (#581)
saishreeeee Jun 10, 2025
ce2cc1a
Merge branch 'main' into HEAD
saishreeeee Jun 17, 2025
99ec875
Merge branch 'main' into telemetry
saishreeeee Jun 17, 2025
cf89ce3
Added functionality for export of failure logs (#591)
saishreeeee Jun 19, 2025
380b0b9
bugfix: stalling test issue (close in TelemetryClientFactory) (#609)
saishreeeee Jun 23, 2025
67a8497
added multithreaded tests, exeception handling tests
saishreeeee Jun 25, 2025
23d8881
Updated tests (#614)
jprakash-db Jun 24, 2025
350e745
Add test to check thrift field IDs (#602)
vikrantpuppala Jun 24, 2025
4a2356d
Revert "Enhance Cursor close handling and context manager exception m…
madhav-db Jun 24, 2025
97df72e
Bump version to 4.0.5 (#615)
madhav-db Jun 24, 2025
76e60fe
Merge branch 'telemetry' into telemetry-testing
saishreeeee Jun 25, 2025
6748c2c
Merge branch 'main' into telemetry
saishreeeee Jun 25, 2025
5a84e11
Merge branch 'telemetry' into telemetry-testing
saishreeeee Jun 25, 2025
70fd810
used batch size instead of default batch size
saishreeeee Jul 1, 2025
0dfe0f4
Add functionality for export of latency logs via telemetry (#608)
saishreeeee Jul 3, 2025
6c5d6ba
Merge branch 'telemetry' into telemetry-testing
saishreeeee Jul 4, 2025
3e9b47d
tests
saishreeeee Jul 4, 2025
11d41ce
test
saishreeeee Jul 4, 2025
10375a8
Merge branch 'main' into telemetry
saishreeeee Jul 7, 2025
8c0f474
Revert "Merge branch 'main' into telemetry"
saishreeeee Jul 7, 2025
13ebfb4
Revert "Revert "Merge branch 'main' into telemetry""
saishreeeee Jul 7, 2025
79db09f
workflows
saishreeeee Jul 7, 2025
5005561
-
saishreeeee Jul 7, 2025
50e771b
actual e2e
saishreeeee Jul 7, 2025
ea86fe2
temp
saishreeeee Jul 7, 2025
96813ad
changed enums to follow proto, get_extractor returns None if not Curs…
saishreeeee Jul 8, 2025
846e701
formatting
saishreeeee Jul 8, 2025
d939fe3
auth mech test fix
saishreeeee Jul 8, 2025
cac9c7a
import logging
saishreeeee Jul 8, 2025
1355283
Merge branch 'telemetry' into telemetry-testing
saishreeeee Jul 8, 2025
c106b95
actual send telemetry
saishreeeee Jul 8, 2025
1f13936
merge main
saishreeeee Jul 11, 2025
85 changes: 85 additions & 0 deletions tests/e2e/test_concurrent_telemetry.py
@@ -0,0 +1,85 @@
import threading
from unittest.mock import patch
import pytest

from databricks.sql.telemetry.telemetry_client import TelemetryClient, TelemetryClientFactory
from tests.e2e.test_driver import PySQLPytestTestCase

def run_in_threads(target, num_threads, pass_index=False):
    """Helper to run target function in multiple threads."""
    threads = [
        threading.Thread(target=target, args=(i,) if pass_index else ())
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


class TestE2ETelemetry(PySQLPytestTestCase):

    @pytest.fixture(autouse=True)
    def telemetry_setup_teardown(self):
        """
        This fixture ensures the TelemetryClientFactory is in a clean state
        before each test and shuts it down afterward. Using a fixture makes
        this robust and automatic.
        """
        # --- SETUP ---
        if TelemetryClientFactory._executor:
            TelemetryClientFactory._executor.shutdown(wait=True)
        TelemetryClientFactory._clients.clear()
        TelemetryClientFactory._executor = None
        TelemetryClientFactory._initialized = False

        yield  # This is where the test runs

        # --- TEARDOWN ---
        if TelemetryClientFactory._executor:
            TelemetryClientFactory._executor.shutdown(wait=True)
        TelemetryClientFactory._executor = None
        TelemetryClientFactory._initialized = False

    def test_concurrent_queries_sends_telemetry(self):
        """
        An E2E test where concurrent threads execute real queries against
        the staging endpoint, while we capture and verify the generated telemetry.
        """
        num_threads = 5
        captured_telemetry = []
        captured_telemetry_lock = threading.Lock()
        captured_responses = []
        captured_responses_lock = threading.Lock()

        original_send_telemetry = TelemetryClient._send_telemetry
        original_callback = TelemetryClient._telemetry_request_callback

        def send_telemetry_wrapper(self_client, events):
            # Record the events under a lock, then delegate to the real sender.
            with captured_telemetry_lock:
                captured_telemetry.extend(events)
            original_send_telemetry(self_client, events)

        with patch.object(TelemetryClient, "_send_telemetry", send_telemetry_wrapper):

            def execute_query_worker(thread_id):
                """Each thread creates a connection and executes a query."""
                with self.connection(extra_params={"enable_telemetry": True}) as conn:
                    with conn.cursor() as cursor:
                        cursor.execute(f"SELECT {thread_id}")
                        cursor.fetchall()

            # Run the workers concurrently
            run_in_threads(execute_query_worker, num_threads, pass_index=True)

            if TelemetryClientFactory._executor:
                TelemetryClientFactory._executor.shutdown(wait=True)

            # --- VERIFICATION ---
            # 3 events per thread (initial_telemetry_log, 2 latency_logs (execute, fetchall))
            assert len(captured_telemetry) == num_threads * 3

            events_with_latency = [
                e for e in captured_telemetry
                if e.entry.sql_driver_log.operation_latency_ms is not None and e.entry.sql_driver_log.sql_statement_id is not None
            ]
            assert len(events_with_latency) == num_threads * 2  # 2 events per thread (execute, fetchall)
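
The capture technique used in this test (wrap a method with `patch.object`, record the arguments under a lock, then delegate to the original) can be tried in isolation. The following is a minimal sketch under stated assumptions: `ToyClient`, its `_send` method, and `capture_sends` are hypothetical stand-ins invented for illustration and are not part of the connector.

```python
import threading
from unittest.mock import patch


class ToyClient:
    """Hypothetical stand-in for a telemetry client (not the connector's class)."""

    def __init__(self):
        self.sent = []

    def _send(self, events):
        self.sent.extend(events)


def capture_sends(num_threads=5):
    """Run one send per thread and return every event seen by the wrapper."""
    captured = []
    lock = threading.Lock()
    # Accessed on the class, this is a plain function that takes `self` explicitly.
    original_send = ToyClient._send

    def send_wrapper(self_client, events):
        # Record under a lock, because several threads call this at once.
        with lock:
            captured.extend(events)
        # Delegate so the real behaviour still happens.
        original_send(self_client, events)

    def worker(i):
        ToyClient()._send([f"event-{i}"])

    # The patch is active only inside this block and is undone on exit.
    with patch.object(ToyClient, "_send", send_wrapper):
        threads = [
            threading.Thread(target=worker, args=(i,)) for i in range(num_threads)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    return captured
```

Because the wrapper is installed on the class rather than on one instance, it also intercepts calls made by clients that are created inside worker threads, which is the situation in the E2E test above.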
177 changes: 175 additions & 2 deletions tests/unit/test_telemetry.py
@@ -2,13 +2,16 @@
import pytest
import requests
from unittest.mock import patch, MagicMock
import threading
import random
import time
from concurrent.futures import ThreadPoolExecutor

from databricks.sql.telemetry.telemetry_client import (
    TelemetryClient,
    NoopTelemetryClient,
    TelemetryClientFactory,
    TelemetryHelper,
    BaseTelemetryClient
)
from databricks.sql.telemetry.models.enums import AuthMech, AuthFlow
from databricks.sql.auth.authenticators import (
@@ -281,4 +284,174 @@ def test_factory_shutdown_flow(self):
        # Close second client - factory should shut down
        TelemetryClientFactory.close(session2)
        assert TelemetryClientFactory._initialized is False
        assert TelemetryClientFactory._executor is None

# A helper function to run a target in multiple threads and wait for them.
def run_in_threads(target, num_threads, pass_index=False):
    """Creates, starts, and joins a specified number of threads.

    Args:
        target: The function to run in each thread
        num_threads: Number of threads to create
        pass_index: If True, passes the thread index (0, 1, 2, ...) as first argument
    """
    threads = [
        threading.Thread(target=target, args=(i,) if pass_index else ())
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

Contributor

Should we add these in a separate file? @jprakash-db, what's the SOP in Python?

class TestTelemetryRaceConditions:
    """Tests for race conditions in multithreaded scenarios."""

    @pytest.fixture(autouse=True)
    def clean_factory(self):
        """A fixture to automatically reset the factory's state before each test."""
        # Clean up at the start of each test
        if TelemetryClientFactory._executor:
            TelemetryClientFactory._executor.shutdown(wait=True)
        TelemetryClientFactory._clients.clear()
        TelemetryClientFactory._executor = None
        TelemetryClientFactory._initialized = False

        yield

        # Clean up at the end of each test
        if TelemetryClientFactory._executor:
            TelemetryClientFactory._executor.shutdown(wait=True)
        TelemetryClientFactory._clients.clear()
        TelemetryClientFactory._executor = None
        TelemetryClientFactory._initialized = False

    def test_factory_concurrent_initialization_of_DIFFERENT_clients(self):
        """
        Tests that multiple threads creating DIFFERENT clients concurrently
        share a single ThreadPoolExecutor and all clients are created successfully.
        """
        num_threads = 20

        def create_client(thread_id):
            TelemetryClientFactory.initialize_telemetry_client(
                telemetry_enabled=True,
                session_id_hex=f"session_{thread_id}",
                auth_provider=None,
                host_url="test-host",
            )

        # Use num_threads rather than a repeated literal so the assertions below stay in sync.
        run_in_threads(create_client, num_threads, pass_index=True)

        # ASSERT: The factory was properly initialized
        assert TelemetryClientFactory._initialized is True
        assert TelemetryClientFactory._executor is not None
        assert isinstance(TelemetryClientFactory._executor, ThreadPoolExecutor)

        # ASSERT: All clients were successfully created
        assert len(TelemetryClientFactory._clients) == num_threads

        # ASSERT: All TelemetryClient instances share the same executor
        telemetry_clients = [
            client for client in TelemetryClientFactory._clients.values()
            if isinstance(client, TelemetryClient)
        ]
        assert len(telemetry_clients) == num_threads

        shared_executor = TelemetryClientFactory._executor
        for client in telemetry_clients:
            assert client._executor is shared_executor

    def test_factory_concurrent_initialization_of_SAME_client(self):
        """
        Tests that multiple threads trying to initialize the SAME client
        result in only one client instance being created.
        """
        session_id = "shared-session"
        num_threads = 20

        def create_same_client():
            TelemetryClientFactory.initialize_telemetry_client(
                telemetry_enabled=True,
                session_id_hex=session_id,
                auth_provider=None,
                host_url="test-host",
            )

        run_in_threads(create_same_client, num_threads)

        # ASSERT: Only one client was created in the factory.
        assert len(TelemetryClientFactory._clients) == 1
        client = TelemetryClientFactory.get_telemetry_client(session_id)
        assert isinstance(client, TelemetryClient)

    def test_client_concurrent_event_export(self):
        """
        Tests that no events are lost when multiple threads call _export_event
        on the same client instance concurrently.
        """
        client = TelemetryClient(True, "session-1", None, "host", MagicMock())
        # Mock _flush to prevent auto-flushing when batch size threshold is reached
        original_flush = client._flush
        client._flush = MagicMock()

        num_threads = 5
        events_per_thread = 10

        def add_events():
            for i in range(events_per_thread):
                client._export_event(f"event-{i}")

        run_in_threads(add_events, num_threads)

        # ASSERT: The batch contains all events from all threads, none were lost.
        total_expected_events = num_threads * events_per_thread
        assert len(client._events_batch) == total_expected_events

        # Restore original flush method for cleanup
        client._flush = original_flush

    def test_client_concurrent_flush(self):
        """
        Tests that if multiple threads trigger _flush at the same time,
        the underlying send operation is only called once for the batch.
        """
        client = TelemetryClient(True, "session-1", None, "host", MagicMock())
        client._send_telemetry = MagicMock()

        # Pre-fill the batch so there's something to flush
        client._events_batch = ["event"] * 5

        def call_flush():
            client._flush()

        run_in_threads(call_flush, 10)

        # ASSERT: The send operation was called exactly once.
        # This proves the lock prevents multiple threads from sending the same batch.
        client._send_telemetry.assert_called_once()
        # ASSERT: The event batch is now empty.
        assert len(client._events_batch) == 0

    def test_factory_concurrent_create_and_close(self):
        """
        Tests that concurrently creating and closing different clients
        doesn't corrupt the factory state and correctly shuts down the executor.
        """
        num_ops = 50

        def create_and_close_client(i):
            session_id = f"session_{i}"
            TelemetryClientFactory.initialize_telemetry_client(
                telemetry_enabled=True, session_id_hex=session_id, auth_provider=None, host_url="host"
            )
            # Small sleep to increase chance of interleaving operations
            time.sleep(random.uniform(0, 0.01))
            TelemetryClientFactory.close(session_id)

        run_in_threads(create_and_close_client, num_ops, pass_index=True)

        # ASSERT: After all operations, the factory should be empty and reset.
        assert not TelemetryClientFactory._clients
        assert TelemetryClientFactory._executor is None
        assert not TelemetryClientFactory._initialized
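
The concurrent-flush test above relies on the batch being swapped out under a lock, so that only the first of several racing flush calls finds anything to send. The following is a minimal sketch of that idea under stated assumptions: `ToyBatcher`, `export_event`, `flush`, and `demo` are hypothetical names for illustration, and the real `TelemetryClient._flush` may be implemented differently.

```python
import threading


class ToyBatcher:
    """Hypothetical batcher; a sketch of lock-protected flushing, not the connector's code."""

    def __init__(self, send):
        self._lock = threading.Lock()
        self._events_batch = []
        self._send = send

    def export_event(self, event):
        with self._lock:
            self._events_batch.append(event)

    def flush(self):
        # Take ownership of the current batch and leave an empty one behind.
        with self._lock:
            to_send = self._events_batch
            self._events_batch = []
        # Send outside the lock; racing callers that got an empty batch do nothing.
        if to_send:
            self._send(to_send)


def demo(num_flushers=10):
    """Pre-fill a batch, flush from many threads, and return the batches that were sent."""
    sent_batches = []
    sent_lock = threading.Lock()

    def send(batch):
        with sent_lock:
            sent_batches.append(batch)

    batcher = ToyBatcher(send)
    for i in range(5):
        batcher.export_event(i)

    threads = [threading.Thread(target=batcher.flush) for _ in range(num_flushers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sent_batches
```

In this sketch exactly one flusher observes the five pre-filled events, so `send` runs once regardless of thread scheduling, which mirrors what `assert_called_once()` checks in the unit test.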