Add data-classification.md extension #1317
base: main
`confidential`, `restricted`.
- Constraints:
- REQUIRED
- SHOULD be applicable to data protection regulation.
Can you elaborate on what this "SHOULD" means? What does someone need to do (from a coding perspective) to adhere to this "SHOULD"?
The SHOULD statement is merely meant as an indication to event producers that the data classification label should have its origin within the applicable data regulation. But maybe this is stating the obvious and is not relevant from a coding perspective. Since it is already stated in the description, it does not add value. I will remove it.
`datacategory` attributes MAY be set to provide additional details on the
classification context.

Intermediaries and consumers SHOULD take these attributes into account and act
I wonder if this "SHOULD" should be a "MUST" instead? Should a consumer reject a request if it can't meet the data regulation requirements? Are clients expecting some kind of guarantee? Meaning, a non-error means "yup, got it and it'll be protected appropriately". Although, extensions can be ignored... maybe it would need to be worded like: "If an implementation supports this extension, then it MUST reject the event if it can not adhere to the requirements of the specified data classification attributes" ??
This raises an interesting possibility, which is too late for v1 but could be interesting in a future version: if an event could say "consumers/intermediaries must understand extensions x, y and z, and must otherwise reject/ignore the event" then we could be stricter. (So that would be an attribute that's part of the main spec, but the values of which would be names of extension attributes.)
Yes, changing this section to be more prescriptive towards consumers is warranted. When an implementation supports this extension, an event MUST be handled in a compliant manner or otherwise MUST be rejected/ignored.
I will adjust the phrasing.
Can you update the README in the "extensions" dir too?
- Type: `String`
- Description: Data classification level for the event payload within the
context of a `dataregulation`. Typical labels are: `public`, `internal`,
`confidential`, `restricted`.
I suspect these values are probably defined by the data regulations being adhered to, but since `dataregulation` is optional, should this spec define some recommended values for cases where it's missing to provide some consistency?
Yes, I feel that is a good approach. I did not want to make the `dataregulation` attribute required, as I feel it is supportive information and not strictly necessary for processing. My intent is that usage of this extension should be as light as possible, meaning as few required attributes as possible.
What do you think of:
Description: Data classification level for the event payload within the context of a `dataregulation`. In a situation where `dataregulation` is undefined, recommended labels are: `public`, `internal`, `confidential`, or `restricted`.
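To make the proposed wording concrete, here is a minimal Python sketch (not part of the PR) of how a consumer might interpret the label. The attribute name `dataclassification` and the helper `check_label` are assumptions made for illustration; only `dataregulation` and the four labels come from the thread.

```python
# Fallback labels quoted in the proposed description above.
RECOMMENDED_LABELS = {"public", "internal", "confidential", "restricted"}


def check_label(attributes: dict) -> str:
    """Report how the classification label of an event should be read.

    `dataclassification` is an assumed attribute name; `dataregulation`
    is the optional attribute discussed in this thread.
    """
    label = attributes.get("dataclassification")
    if not label:
        # The classification attribute is marked REQUIRED in the draft.
        return "missing"
    if attributes.get("dataregulation"):
        # The label vocabulary is defined by the named regulation(s).
        return "regulation-defined"
    if label in RECOMMENDED_LABELS:
        return "recommended"
    return "non-recommended"
```

A label outside the recommended set is reported rather than rejected here, since the wording under discussion recommends the labels without requiring them.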
@@ -0,0 +1,89 @@
# Data Classification Extension

CloudEvents might contain payload which is subjected to data protection
s/payload/payloads/ & s/is/are/
or
s/payload/a payload/
but I prefer the former
CloudEvents might contain payload which is subjected to data protection
regulations like GDPR or HIPAA. For intermediaries and consumers knowing how
event payload is classified, which data protection regulation applies and how
payload is categorized, enables compliant processing of an event.
s/payload is/payloads are/
- Description: Data classification level for the event payload within the
context of a `dataregulation`. In situations where `dataregulation` is
undefined or the data protection regulation does not define any labels, then
recommended labels are: `public`, `internal`, `confidential`, or
s/recommended/RECOMMENDED/
For example: `GDPR`, `HIPAA`, `PCI-DSS`, `ISO-27001`, `NIST-800-53`, `CCPA`.
- Constraints:
- OPTIONAL
- if present, MUST be a non-empty string
anal but... since it's a string with commas that need to be parsed, we may want to add something like "leading and trailing spaces around each entry MUST be ignored. Spaces within an entry MAY exist but MUST be reduced down to a single space for comparison purposes". Are spaces allowed in the entries?
Correct, spaces are not allowed.
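For illustration only, a strict reading of "spaces are not allowed" could be sketched in Python as below. The function name `parse_dataregulation` is hypothetical, and rejecting whitespace around entries (rather than trimming it) is an assumption about how the final wording would be interpreted.

```python
from typing import List


def parse_dataregulation(value: str) -> List[str]:
    """Split a comma-delimited `dataregulation` value into entries.

    Assumes the strict reading from this thread: the value is non-empty
    and no entry may be empty or contain whitespace.
    """
    if not value:
        raise ValueError("dataregulation, if present, must be a non-empty string")
    entries = value.split(",")
    for entry in entries:
        if entry == "" or any(ch.isspace() for ch in entry):
            raise ValueError(f"invalid dataregulation entry: {entry!r}")
    return entries
```

Under this reading `"GDPR,HIPAA"` parses into two entries, while `"GDPR, HIPAA"` is rejected instead of being silently normalized.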
Examples where data classification of events can be useful are:

- When an event contains PII or restricted information and therefore processing
by intermediaries or consumers MUST adhere to certain policies. For example
s/MUST/need to/ since this isn't a normative section, it's just examples.
Not an expert in this space but it LGTM with the minor edits I just commented on.
@rob-sessink while not 100% necessary, can you rebase this on the latest 'main' branch so that the tests will run successfully for you?
Signed-off-by: Rob Sessink <[email protected]>
…README.md and usage of MUST keyword in example use case - Signed-off-by: Rob Sessink <[email protected]>
Signed-off-by: Rob Sessink <[email protected]>
…bels, remove 'applicability constraints', extend usage section. - Signed-off-by: Rob Sessink <[email protected]>
b891424 to b22870d
…onventions - Signed-off-by: Rob Sessink <[email protected]>
woo hoo - tests again! thanks for the rebase. Ping @jskeet for another look
Signed-off-by: Rob Sessink <[email protected]>
- Type: `String`
- Description: A comma-delimited list of applicable data protection regulations.
For example: `GDPR`, `HIPAA`, `PCI-DSS`, `ISO-27001`, `NIST-800-53`, `CCPA`.
I realize this potentially goes down a rabbit-hole of trying to maintain catalogs, but is there value in formalizing some of the regulation codes or referencing some well-known external catalog (if one exists)?
In addition, does the applicability of some of these regulations vary by jurisdiction? If so, does that need to be represented in some fashion?
I think Jem's concerns are reasonable, but I don't know an appropriate resolution. (It may be that there's already a standards body defining these.)
When an implementation supports this extension, then intermediaries and
consumers MUST take these attributes into account and act accordingly to data
regulations and/or internal policies in processing the event and payload. If
intermediaries or consumers cannot meet such requirements, they MUST reject or
What should a client do if they "support" this extension, but see values they don't know about (e.g. a new `dataregulation` value)? We may want to adjust this to say "if you don't know you can meet requirements, you should assume you can't".
Agreed, I have adjusted the phrasing. Please have a quick look.
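As a rough illustration of the "if you don't know you can meet requirements, assume you can't" rule, a consumer-side check might look like the Python sketch below. The function `can_process`, the attribute name `dataclassification`, and the two "known" sets are hypothetical; the PR defines no API.

```python
from typing import Dict, Set


def can_process(attributes: Dict[str, str],
                known_labels: Set[str],
                known_regulations: Set[str]) -> bool:
    """Return True only if every classification value is understood.

    Unknown values are treated as requirements the consumer cannot meet,
    so the caller should reject or ignore the event when this is False.
    """
    label = attributes.get("dataclassification")  # assumed attribute name
    regulation = attributes.get("dataregulation")
    if label is None and regulation is None:
        # The event does not use the extension; nothing to enforce.
        return True
    if label not in known_labels:
        return False
    if regulation is None:
        return True
    # Every listed regulation must be one the consumer knows how to honor.
    return all(entry in known_regulations for entry in regulation.split(","))
```

The check fails closed: a single unrecognized regulation in the list is enough to turn the event away.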
…s when intermediaries/consumers encounter unknown attribute values. - Signed-off-by: Rob Sessink <[email protected]>
Signed-off-by: Rob Sessink <[email protected]>
Provides an extension where an event source can annotate an event with
information about the data classification of the event and its payload.
CloudEvents may contain payloads that are subject to data protection regulations
like GDPR or HIPAA. For intermediaries and consumers, knowing how an event
payload is classified enables compliant processing of the event.
Adds an extension with attributes:
payload within the context of a data protection regulation.
context of data classification and data protection regulation.