[o365] Mapping, parsing of o365.audit fields Platform and Data.* #8571

Merged
merged 6 commits into from
Jan 17, 2024

Contributor

@chrisberkhout chrisberkhout commented Nov 24, 2023

Proposed commit message

[o365] Mapping, parsing of o365.audit fields Platform and Data.* (#8571)

The Data value was a JSON blob mapped as a keyword. Authoritative
documentation of its content could not be found but we have a list of
known fields and types that match feedback from multiple users. Data
is now parsed and indexed as flattened in `o365.audit.Data.flattened`,
and known fields are explicitly mapped in `o365.audit.Data.*`.

The Platform field is documented as appearing in AIP (Azure Information
Protection) events and may appear elsewhere. Known value codes are
converted to corresponding strings, although values may appear directly
as strings.

Data notes

According to the Office 365 Management Activity API documentation, the Data parameter appears in events for the Security and Compliance Alerts schema and the Automated investigation and response events in Office 365, Main investigation schema. It is described only as "The detailed data blob of the alert or alert entity" or "Data string which contains more details about investigation entities, and information about alerts related to the investigation. Entities are available in a separate node within the data blob."

A list of fields and types inside the Data JSON blob was provided by a user in #4319 (comment). Example values I could find matched the user-supplied list.
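As a rough illustration of the resulting document shape, the Python sketch below parses a Data string, keeps the full object under a `flattened` key, and exposes known keys alongside it. It is not the integration's implementation (that is an Elasticsearch ingest pipeline); `KNOWN_KEYS` and `split_data` are hypothetical names, and the key set is only a subset drawn from the example values in this description.

```python
import json

# Hypothetical subset of known keys, taken from the example values in this PR.
KNOWN_KEYS = {"sip", "at", "md", "te", "ts", "ttdt", "op", "wl", "tid", "lon", "sev"}


def split_data(raw: str) -> dict:
    """Parse the Data JSON string into known fields plus a full flattened copy."""
    parsed = json.loads(raw)  # raises json.JSONDecodeError if Data is not JSON
    if not isinstance(parsed, dict):
        raise ValueError("Data is expected to be a JSON object")
    # Known keys are exposed directly, mirroring o365.audit.Data.*
    result = {key: value for key, value in parsed.items() if key in KNOWN_KEYS}
    # The complete object is retained, mirroring o365.audit.Data.flattened
    result["flattened"] = parsed
    return result


raw = '{"op": "GrantAdminPermission", "sev": "Low", "xyz": "unknown key"}'
doc = split_data(raw)
print(doc["op"])                # known key, exposed directly
print(doc["flattened"]["xyz"])  # unknown key, still reachable via the full copy
```

Keeping the full object means keys that are not on the known list remain searchable, at the cost of storing the known values twice.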

Known example values for the Data parameter

I found a couple of examples in this forum post:

{
  "f3u": "[email protected]",
  "ts": "2022-01-07T01:46:00.0000000Z",
  "te": "2022-01-07T01:47:00.0000000Z",
  "op": "eDiscoverySearchStartedOrExported",
  "wl": "SecurityComplianceCenter",
  "tid": "cannot be shared",
  "tdc": "1",
  "reid": "cannot be shared",
  "rid": "cannot be shared",
  "cid": "cannot be shared",
  "ad": "The alert is triggered when users start content searches or eDiscovery searches or when search results are downloaded or exported -V1.0.0.1",
  "lon": "eDiscoverySearchStartedOrExported",
  "an": "eDiscovery search started or exported",
  "sev": "Informational"
}
{
  "etype": "MalwareFamily",
  "at": "2022-01-07T10:30:12.2317606Z",
  "md": "2022-01-04T15:58:44.0000000Z",
  "sip": "X.X.X.X",
  "ms": "John, Demo Vonage | Receive a JBL Charge 4 Waterproof Bluetooth Speaker",
  "imsgid": "<0100017e25d05d3d-99c96535-9e78-4fb4-98c0-2e50a900657e-000000@tiedtlawemail.amazonses.com>",
  "ttdt": "2022-01-07T10:30:12.2317606Z",
  "ttr": "Success_MessageQuarantined",
  "dm": "Campaign",
  "eid": "e67adg1s-f89d-173a-13ed-04f8va3a2111-4534686441061676516-1",
  "aii": "g45klj7f-f45u-376e-12fg-9089gr9b1333",
  "thn": "Phish, Malicious",
  "ts": "2022-01-07T10:29:12.2317606Z",
  "te": "2022-01-07T10:31:12.2317606Z",
  "fvs": "Filters",
  "tpt": "HostedContentFilterPolicy",
  "tpid": "d4y2c90c-2fce-4890f-a3e2-3f17896ba6889",
  "tid": "4e123dd-a89c-23d1-9b45-abbab3596529",
  "tht": "Phish, Malicious",
  "trc": "[email protected]",
  "tsd": "[email protected]",
  "tdc": "1",
  "cpid": "CCG8G11D.F1RR4TY.4BE345D4.R0EF7865.200B2",
  "lon": "Protection"
}

And a couple more in existing test data for this integration:

{
  "etype": "User",
  "eid": "[email protected]",
  "tid": "b86ab9d4-fcf1-4b11-8a06-7a8f91b47fbd",
  "ts": "2020-02-14T18:54:45.0000000Z",
  "te": "2020-02-14T18:54:45.0000000Z",
  "op": "GrantAdminPermission",
  "tdc": "1",
  "suid": "[email protected]",
  "ut": "Admin",
  "lon": "GrantAdminPermission"
}
{
  "f3u": "[email protected]",
  "ts": "2020-02-14T18:45:00.0000000Z",
  "te": "2020-02-14T19:00:00.0000000Z",
  "op": "GrantAdminPermission",
  "wl": "Exchange",
  "tid": "b86ab9d4-fcf1-4b11-8a06-7a8f91b47fbd",
  "tdc": "1",
  "reid": "23a5e271-e297-4f35-ff57-08d7b17f5bf2",
  "rid": "f81f1b69-dc60-4ded-918e-e17d5c73b29f",
  "cid": "17d51759-88e1-40c1-8df3-20bcf2e43057",
  "ad": "This alert is triggered when someone in your organization becomes an Exchange admin or gets new Exchange admin permissions -V1.0.0.1",
  "lon": "GrantAdminPermission",
  "an": "Elevation of Exchange admin privilege",
  "sev": "Low"
}

Platform notes

According to the Office 365 Management Activity API schema documentation, the Platform parameter appears in various AIP (Azure Information Protection) events. It is described there as "Device platform (Win, OSX, Android, iOS)", or "The platform on which the activity happened. For example, Windows."

A user reported the field in an event with "Workload": "PublicEndpoint", "Application": "Outlook" and "Operation": "SensitivityLabelApplied". However, this example came via the deprecated o365audit input rather than the current cel input, and is not covered in the Office 365 Management Activity API documentation.

The reported values were both strings and integer codes.

Other pages in the Office 365 Management API documentation provide more detail about AIP events; they can be found in the Aip*.md files in MicrosoftDocs/office-365-management-api. Those pages list integer codes and their corresponding meanings, so known codes are replaced with the corresponding strings where found.
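The conversion described above can be sketched in Python as follows. This only illustrates the idea and is not the pipeline's code: `PLATFORM_CODES` and `normalize_platform` are hypothetical names, and the two table entries are placeholder assumptions rather than values copied from the Aip*.md files.

```python
# Placeholder code table (an assumption for illustration only). The actual
# codes and meanings are listed in the Aip*.md documentation files.
PLATFORM_CODES = {0: "Unknown", 1: "Windows"}


def normalize_platform(value):
    """Return a platform string for an integer code, a numeric string, or a name."""
    if isinstance(value, int):
        # Unknown codes are passed through as strings rather than dropped.
        return PLATFORM_CODES.get(value, str(value))
    text = str(value).strip()
    if text.isdigit():
        return PLATFORM_CODES.get(int(text), text)
    return text  # already a string such as "Windows"


for sample in (1, "1", "Windows", 99):
    print(repr(sample), "->", normalize_platform(sample))
```

Passing unrecognized codes through unchanged keeps the original value visible when the table is incomplete.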

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

How to test this PR locally

elastic-package stack up -d
elastic-package build -v && elastic-package install -v
elastic-package test -v

Related issues

@chrisberkhout chrisberkhout self-assigned this Nov 24, 2023
@chrisberkhout chrisberkhout requested a review from a team as a code owner November 24, 2023 16:46
@elasticmachine

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

@elasticmachine

elasticmachine commented Nov 24, 2023

💚 Build Succeeded

Build stats

  • Start Time: 2023-12-11T10:07:54.721+0000

  • Duration: 16 min 3 sec

Test stats 🧪

Test Results
Failed 0
Passed 26
Skipped 0
Total 26

@elasticmachine

elasticmachine commented Nov 24, 2023

🌐 Coverage report

Name          Metrics % (covered/total)  Diff
Packages      100.0% (1/1)               💚
Files         100.0% (1/1)               💚
Classes       100.0% (1/1)               💚
Methods       100.0% (17/17)             💚
Lines         80.844% (785/971)
Conditionals  100.0% (0/0)               💚

@chrisberkhout chrisberkhout changed the title Mapping and parsing of o365.audit.Data.* and o365.audit.Platform [o365] Mapping and parsing of o365.audit.Data.*, .Platform Nov 24, 2023
Comment on lines 70 to 77
multi_fields:
  - name: sip
    type: ip
  - name: at
    type: date
Contributor

Is this kind of multi-field approach valid?

Contributor Author

@chrisberkhout chrisberkhout Nov 27, 2023

Actually, no. I thought I'd found a good way to leave the full data where it was and expose individual fields with the least surprising names, but looking in Kibana I'm only seeing the one flattened field so I think that's not the way to go.

Now I'm planning to do it like this:

  • o365.audit.Data type group (of parsed fields of types ip, date, and keyword)
  • o365.audit.DataRest type flattened (fields not broken out as o365.audit.Data.*)

And I guess now it should definitely get a major version bump, because it's not just a type change and field additions, it's moving data that was under o365.audit.Data to elsewhere.

What do you think?

Contributor Author

Done in a new commit.

Contributor

In GCP I did this (actually this, but the ideas are similar). The situation here does not need to be conditional like there, but I think consistent and discoverable field naming is a good idea.

I agree, given that the fields are moving, this would be a major bump.

Contributor Author

Okay, thanks, that looks good. Done here in a new commit.

@chrisberkhout chrisberkhout changed the title [o365] Mapping and parsing of o365.audit.Data.*, .Platform [o365] Mapping, parsing of o365.audit fields Platform and Data.* Nov 27, 2023
Contributor

@efd6 efd6 left a comment

LGTM but please get opinion from @P1llus.

@elasticmachine

🚀 Benchmarks report

To see the full report comment with /test benchmark fullreport

Member

@P1llus P1llus left a comment

Added some minor comments or thoughts, though feel free to let me know if they don't make sense :)

@@ -1090,6 +1108,64 @@ processors:
      field: o365audit.YammerNetworkId
      type: string
      ignore_missing: true
  - json:
      field: o365audit.Data
      if: ctx.o365audit?.containsKey('Data')
Member

So we can confirm this is never a string? You could always change this to startsWith a bracket if you are unsure.

Contributor Author

The documentation isn't that explicit but based on examples and online discussion it seems to always be JSON. I'll add the startsWith check anyway.

Contributor Author

On second thought, I think it's better to let that processor fail if it's not JSON data. It does seem to always be JSON, and if we're parsing and indexing it, it doesn't seem worth retaining a field for a non-JSON string.
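The two options discussed in this thread can be contrasted with a small Python sketch. It is an analogy rather than the ingest pipeline itself, and `parse_guarded` and `parse_strict` are hypothetical names: the first mirrors the suggested check for a leading bracket, the second mirrors letting the parse step fail on non-JSON input.

```python
import json


def parse_guarded(raw: str):
    """Suggested variant: only parse values that look like a JSON object."""
    if raw.lstrip().startswith("{"):
        return json.loads(raw)
    return raw  # non-JSON strings are kept unchanged


def parse_strict(raw: str):
    """Variant the author settled on: let parsing fail for non-JSON input."""
    return json.loads(raw)  # raises json.JSONDecodeError for non-JSON strings


for candidate in ('{"op": "GrantAdminPermission"}', "plain text"):
    try:
        print("strict:", parse_strict(candidate))
    except json.JSONDecodeError as err:
        print("strict failed:", err)
    print("guarded:", parse_guarded(candidate))
```

With the guarded variant the field can end up holding either an object or a string, which is the mixed-type outcome the strict variant avoids by surfacing an error instead.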

Comment on lines +1135 to +1168
  - convert:
      field: o365audit.Data.sip
      type: ip
      ignore_missing: true
  - date:
      field: o365audit.Data.at
      target_field: o365audit.Data.at
      formats:
        - ISO8601
      if: ctx.o365audit?.Data?.at != null
  - date:
      field: o365audit.Data.md
      target_field: o365audit.Data.md
      formats:
        - ISO8601
      if: ctx.o365audit?.Data?.md != null
  - date:
      field: o365audit.Data.te
      target_field: o365audit.Data.te
      formats:
        - ISO8601
      if: ctx.o365audit?.Data?.te != null
  - date:
      field: o365audit.Data.ts
      target_field: o365audit.Data.ts
      formats:
        - ISO8601
      if: ctx.o365audit?.Data?.ts != null
  - date:
      field: o365audit.Data.ttdt
      target_field: o365audit.Data.ttdt
      formats:
        - ISO8601
      if: ctx.o365audit?.Data?.ttdt != null
Member

None of these match any date-type ECS fields?

Contributor Author

Some might, but I haven't been able to find any documentation of these fields.

Users want access to them but we don't have dependable information about their meaning, which may also vary significantly.
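For readers unfamiliar with the convert and date processors in the hunk under discussion, the Python sketch below approximates the same type conversions on an already parsed Data object. It is an analogy under stated assumptions, not the pipeline: `DATE_KEYS`, `parse_timestamp` and `convert_types` are hypothetical names, and the timestamp parser only handles the `Z`-suffixed form seen in the example values, trimming the seven-digit fraction to microseconds.

```python
import ipaddress
import re
from datetime import datetime, timezone

# Keys that receive a date processor in the hunk above.
DATE_KEYS = ("at", "md", "te", "ts", "ttdt")

_TS_PATTERN = re.compile(r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z")


def parse_timestamp(value: str) -> datetime:
    """Parse a Z-suffixed ISO 8601 timestamp, trimming fractions to microseconds."""
    match = _TS_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"unsupported timestamp: {value!r}")
    base, fraction = match.groups()
    micros = int((fraction or "0")[:6].ljust(6, "0"))
    parsed = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(microsecond=micros, tzinfo=timezone.utc)


def convert_types(data: dict) -> dict:
    """Return a copy of data with sip as an IP address and date keys as datetimes."""
    out = dict(data)
    if "sip" in out:
        out["sip"] = ipaddress.ip_address(out["sip"])  # ValueError if not an IP
    for key in DATE_KEYS:
        if out.get(key) is not None:  # same null check as the `if` conditions
            out[key] = parse_timestamp(out[key])
    return out


converted = convert_types(
    {"sip": "192.0.2.10", "at": "2022-01-07T10:30:12.2317606Z", "op": "x"}
)
print(converted["sip"], converted["at"].isoformat(), converted["op"])
```

Keys without a conversion, such as `op` here, are left as strings, which corresponds to the keyword mapping used for the remaining known fields.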

  - json:
      field: o365audit.Data
      if: ctx.o365audit?.containsKey('Data')
  - rename:
Member

There is no real reason to still keep the string, so let's just remove it instead? We are moving it to a separate field, breaking it for anyone that might have needed to unpack it either way?

Contributor Author

The reason is that there might be other keys in there that we don't know about.

It's a pity that o365.audit.Data changes type, but o365.audit.Data.knownfield and o365.audit.Data.flattened.otherfield seems better than

  • having the known fields under some other prefix
  • losing the full data
  • using a pattern for this scenario that's different than this / this.

Comment on lines +142 to +143
"lon": "GrantAdminPermission",
"op": "GrantAdminPermission",
Member

Multiple of these fields look like good candidates for things like event.action, unless that is already set with a better value?

Contributor Author

In this case we do have "event.action": "AlertTriggered". More generally, we don't have documentation for the individual fields and although we have a list of typical keys, the meanings may vary.

@chrisberkhout
Contributor Author

@P1llus I responded to each point, but in summary:
The o365.audit.Data field does seem to always be JSON and clients want better access to the fields in it. We don't have a definitive list of the field names or meanings so we're indexing the keys and types we know, and making others available in the full object in the o365.audit.Data.flattened field. That pattern of specific fields and a flattened catchall matches what we've done in GCP and I believe elsewhere.

@mrodm
Contributor

mrodm commented Dec 21, 2023

Hi @chrisberkhout, please update your branch with the latest contents from main branch. There was an important PR merged updating the CI pipelines. Thanks!

@chrisberkhout chrisberkhout force-pushed the o365-audit-data-fields branch from ed1e5d0 to aea1939 Compare January 17, 2024 10:57
@elasticmachine

💚 Build Succeeded

cc @chrisberkhout

@chrisberkhout chrisberkhout merged commit 78def5f into elastic:main Jan 17, 2024
3 checks passed
@chrisberkhout chrisberkhout deleted the o365-audit-data-fields branch January 17, 2024 14:32
@elasticmachine

Package o365 - 2.0.0 containing this change is available at https://epr.elastic.co/search?package=o365

Labels
enhancement New feature or request Integration:o365 Microsoft Office 365
Development

Successfully merging this pull request may close these issues.

Add o365.audit.Platform to O356 integration templates
5 participants