-
Notifications
You must be signed in to change notification settings - Fork 419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add related.entity
field
#2360
Open
romulets
wants to merge
5
commits into
elastic:main
Choose a base branch
from
romulets:add-related-entities
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Add related.entity
field
#2360
Changes from all commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
entity
is an extremely broad category. The danger with using this is it will mean different things to different people, and become a bucket that will hold almost anything.This would reduce the effectiveness of having a common schema, as this field will be used by different users to hold different types of data, and cause problems with writing queries, doing data normalization, etc. Already in the description, there's resource IDs, email addresses, and hostnames, which are three different things.
I think you'll need to consider the use-cases for this, and refine the definition of what this is intended to hold. Maybe just
cloud_resource_names
? Or have multiple fields for the different types of data that could be related.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Michael, I see where you coming from.
However, our need is very broad, indeed. What we wish is to be able to find any event related to an entity. What is an entity? Can be very much anything. A workstation. A bare metal machine. A user. An ec2 instance. A database. Pretty much anything a SoC team is concerned about.
But then why not specify
cloud_resource_name
orcloud_entity
? Ideally, from a user experience perspective, a user doesn't need to know all the ecs field types to search by something. Doesn't need to think twice or search before typing its search. I do see the point over data organisation on having very separated buckets, but from a search perspective, that decreases the experience. Beyond that, some concepts are just overlapping. We haverelated.host
,related.ips
which both hold information about a machine that can be seen as an entity. So where does the data about that specific host exist? We believe it would be easier to just have all the data inrelated.entity
and search from there.With that said, you mentioned that having it all in one field would reduce the effectiveness of data. Can you expand on that? Why would it cause problems writing queries and doing data normalisation?
Tagging @tinnytintin10 so he can give his cents as product (if he wishes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your thoughtful analysis of the proposal, @mjwolf!
You're right that "entity" is an extremely broad category, and that's intentional. Let me explain our reasoning and address your concerns:
Regarding alternatives (like the one I mentioned in the last bullet above), creating implicit entity fieldsets for each possible entity type would be a significant undertaking (especially in the cloud). If we were to follow the pattern of existing fields like "host" and "user", we'd quickly run into an explosion of entity types. Consider this non-exhaustive list of potential generic entity types we'd need to account for/introduce:
expand me
a few of these might have some ecs fields available...
This list doesn't even include some of the entity types we already have ECS fields for, such as those related to hosts, users, and Kubernetes (which ECS calls orchestrator).
Creating and maintaining fields for each of these entity types would not only take considerable time to implement but would also result in a proliferation of field types. This approach would place a substantial cognitive burden on users, requiring them to remember a large number of specific fields for different entity types.
The related.entity field addresses this challenge by providing a single, unified field for correlation. Users don't need to know the implicit entity type for each resource to correlate events, greatly simplifying the process. For instance, they wouldn't need to know that a bucket is for blob storage or that an ARN identifies an AWS resource - they could simply use related.entity to find all events related to that entity. i.e., related.entity offers a user-friendly way to correlate events across diverse entity types without overwhelming users or complicating the schema unnecessarily.
As we move forward, we'll continue to evaluate and adapt based on the evolving needs of our users and the insights we gain from this implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tinnytintin10 for the excellent explanation, I think this makes sense for achieving what you want to achieve
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mjwolf what do we need to wrap this PR up? If you approve we can merge or it must be discussed in other forums?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, I've been looking into how could we implement this in OpenTelemetry. @trisch-me has been supporting me on the process. The summary is:
It's under discussion concepts that have some correlation with what we want to do. OTel have the concept of Resources, which "represents the entity producing telemetry as resource attributes". The concept of entity is under discussion here .
However, this discussion doesn't fully cover what we want because they are focusing mainly on resolving the problem of what entity produced the telemetry observation. What we want to observe is different, we want to know what entity has relation with the emitted event (actor or target). An event can have multiple related entities. And the emitting entity might have nothing to do with the information we would like. Example:
romulo
created the ec2 instancei-001
was emitted by the trailmonitor-elastic
. On this case, only the entitiesromulo
andi-001
are relevant to the security use case. The entity trailmonitor-elastic
is supporting information of how that event was reported. But no security interest there.I have discussed our use case both in Semantics SIG and Entity SIG. Both groups agreed that our use case is legit. But Entity SIG (the correct group to have this discussion) has other priorities to discuss right now. They suggested to us, as elastic, to open an OTep and be prepare our case to be discussed in the Entity SIG.
Because I believe this goes beyond just CDR, and other teams might have interest, I want to take this discussion further and find how we, as elastic, want to take this topic on. My next point of discussion will be at the Elastic OpenTelemetry Office Hours, where I'll raise what we want to do and see if more people in that group would have an interest.
There isn't, however any timeline available of when this discussion will properly start or end. @trisch-me and I agreed to merge this PR asynchronously from any OTel discussion or outcome, because the pace doesn't seem too promising.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I believe such a broad topic will take months in Otel to at least formalise correctly, not saying about implementation.
Also I want to add that we are talking here only about entities inside related. But the whole concept of related namespaces will not be ported to the otel as is, there is no place for it in this format. So before we could have entities there we should think about parent namespace
related
first.Saying that I believe we should proceed with this topic in ECS first
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tinnytintin10 hey, thank a lot for this explanation, but I still have doubts. Let's say I'm a developer who is using ECS for my case (for example we at Elastic are using ECS in Endgame), how should I detect if something from my event/log entry is an entity or not? Should I just throw into the bucket everything
I think
can be entity?I understand it's a broader topic, but I would like to have more clarity and examples there. Also to give everyone reading the info an understanding of the field and data put into it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@trisch-me do you have examples on what do you mean by
I think
can be entity? Not sure what are the use cases you thought of.I'm understanding entity as:
Machines, virtual or physical, are entities. Instances of tooling/services/components such as queueing topics/subscriptions, databases, networking components, object storages, authentication and authorization components (and others) are entities. So are Users and they representation in different systems, like Okta, AWS, Azure, Active Directory and others.
Did you think something I didn't mention here @trisch-me?
Regardless, I agree that we as elastic should have a formal definition written in our docs of what are entities, and what are not entities. That will help us move forward with less doubts and friction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I said
I think
it was a reference to a developer, who might not understand or know what entities are, especially if it's another non-security domain. So my request is to have concrete proposal on what entity is, directly in the description or notes. Even sentences you wrote above are better explanation than we currently have in proposal. We might also have this definition somewhere else and have just a link to it