-
Notifications
You must be signed in to change notification settings - Fork 1.5k
SQLServer Extended Event Handlers #20229
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
550ae52
to
19a6803
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good, really appreciate the thorough commenting. A few questions/comments then LGTM
@@ -997,6 +999,50 @@ files: | |||
value: | |||
example: false | |||
type: boolean | |||
- name: xe_collection | |||
description: | | |||
Available for Agent 7.67 and newer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: We generally don't have comments like this since the spec file is in Git
description: | | ||
Configure the collection of completed queries from the `datadog_query_completions` XE session. | ||
|
||
Set `query_completions.enabled` to `true` to enable the collection of query completion events. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: You can set these as descriptions of the properties themselves
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is actually really annoying - we can't set descriptions on properties of an "object". You can set descriptions on individual options. The reason I'm using object here is to get the nested configuration of
- xe_collection
- query_completion
- enabled
- query_error
- enabled
- query_completion
We're not able to get this nesting if we use the "option" value. I'm going to keep this format unless we want to talk about this piece more.
return "" | ||
try: | ||
# Parse the timestamp | ||
if timestamp_str.endswith('Z'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there no existing method in Python for parsing timestamps like these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, dateutil parser has functionality for this. Updating.
def _query_ring_buffer(self): | ||
""" | ||
Query the ring buffer data and parse the XML on the client side. | ||
This avoids expensive server-side XML parsing for better performance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What impact does this have on agent performance? Generally the agent is running on much smaller hardware than the database so if the performance hit is that severe will it cause problems with the agent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should not be heavy on the agent. Moving this parsing to agent side, I found that we can do the parsing much faster than the db, while reducing the ring buffer query time.
return None | ||
|
||
# Define the file path pattern | ||
file_path = f"d:\\rdsdbdata\\log\\{self.session_name}*.xel" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this path be configurable, i.e., is it always the same for all installations? I'm assuming the rds
in there is for AWS but what about non-AWS installs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a dead function right now, since event file is not configurable. Can either delete the function for now or add a comment that the code is not live
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely shouldn't deploy with dead code. I'd suggest moving it into some sort of scrap file or branch if you expect to need it in the future.
filtered_events = [] | ||
try: | ||
# Convert string to bytes for lxml | ||
xml_stream = BytesIO(xml_data.encode('utf-8')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the data returned always utf-8
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the raw data is taken at raw_xml = str(row[0])
, which is already getting a properly decoded string. I just choose an encoding here to convert it to bytes for lxml parsing. Just in case though, I can add back up logic to encode in utf-16 if we hit encoding errors.
self._log.error(f"Error filtering ring buffer events: {e}") | ||
return [] | ||
|
||
def _extract_value(self, element, default=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: I would move all these XML functions into a separate file
|
||
return default | ||
|
||
def _extract_int_value(self, element, default=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there no Python functionality for these XML methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved these to their own file, and used xpath to grab the values instead, which is slightly more pythonic
|
||
# Log a sample of events (up to 3) for debugging | ||
if self._log.isEnabledFor(logging.DEBUG): | ||
sample_size = min(3, len(events)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be configurable if it's for debugging?
continue | ||
|
||
# Create a single batched payload for all events if we have any | ||
if all_query_details: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How big can this data get? If it's in the MB, you may have to split these up to avoid hitting the payload max of 20MB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
largest it should get is 2 MB, which is the ring buffer max size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we implement event file, we will definitely need to think about splitting these.
What does this PR do?
Implements the SQLServer Extended Event Handlers. This enables deobfuscation and query error visibility. This is a beefy PR so I will describe at a high level each component. See the RFC here
Configuration
XE Handler
Events emitted
Currently collects three types of query completion events, emitted as dbm_type=query_completion:
Collects two types of error events, emitted as dbm_type=query_error:
Emits RQT (Raw Query Text) events when collect_raw_query_statement.enabled is true:
Testing
Motivation
Get query error and deobfuscated query visibility for sqlserver. This is a targeted feature for Rockstar, but greatly strengthens DBM's sqlserver offering.
Review checklist (to be filled by reviewers)
qa/skip-qa
label if the PR doesn't need to be tested during QA.backport/<branch-name>
label to the PR and it will automatically open a backport PR once this one is merged