Add UnstructuredData and UncategorisedData concepts/properties #240

Open
coolharsh55 opened this issue Feb 11, 2025 · 1 comment

Comments

@coolharsh55
Collaborator

  1. Add property hasUnstructuredData to indicate data that is "without a pre-defined data model or is not organised in a pre-defined manner" (source: Wikipedia). This is relevant to the application of privacy/data protection regulations like GDPR, as unstructured data may contain PII/personal data. The range of the property would be dpv:Data, but to be more precise a new concept called UnstructuredData should be added as a subclass of dpv:Data.
  2. Since "unstructured" only refers to the underlying modelling of data, the more precise relation hasUncategorisedData should be added to indicate data that has not been categorised according to some (legally) relevant category, such as whether it is PersonalData or NonPersonalData (or both). The concept UncategorisedData, as a subclass of dpv:Data, would be the range of this property.
  3. The concepts relating to structure should be added to TECH extension as there may be additional related concepts (e.g. metadata) that we would want to add. The concepts relating to categorisation should be added to DPV alongside existing concepts as these are relevant for legal and organisational processes.
  4. Whether the concepts StructuredData and CategorisedData are needed should be discussed (e.g. it is necessary to flag unstructured and uncategorised data, but is the inverse necessary?). These concepts might be helpful to indicate the data is not unstructured and not uncategorised.
  5. The concept UncategorisedData would also be helpful for exploring the concept UnlabelledData to be added to TECH or AI extension.
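
The hierarchy proposed in items 1 to 3 can be sketched roughly as follows. This is only an illustrative Python model, not DPV's actual serialisation. The concept and property names and the dpv:Data parent come from the proposal; the `tech:` prefix, the placement of hasUnstructuredData in TECH, and the helper names `CONCEPTS`, `PROPERTIES`, `is_subclass_of`, and `valid_range` are assumptions made for the example.

```python
# Illustrative sketch of the proposed concepts; the prefixes and helper
# names are assumptions, not taken from the DPV specification.

# Each concept maps to its direct parent (None for the root used here).
CONCEPTS = {
    "dpv:Data": None,
    "tech:UnstructuredData": "dpv:Data",   # proposed for the TECH extension
    "dpv:UncategorisedData": "dpv:Data",   # proposed for DPV itself
}

# Each property maps to the concept proposed as its range.
PROPERTIES = {
    "tech:hasUnstructuredData": "tech:UnstructuredData",
    "dpv:hasUncategorisedData": "dpv:UncategorisedData",
}


def is_subclass_of(concept: str, ancestor: str) -> bool:
    """Return True if `concept` equals `ancestor` or descends from it."""
    current = concept
    while current is not None:
        if current == ancestor:
            return True
        current = CONCEPTS[current]
    return False


def valid_range(prop: str, value_type: str) -> bool:
    """Check that a value's type falls within the property's proposed range."""
    return is_subclass_of(value_type, PROPERTIES[prop])


if __name__ == "__main__":
    # Both proposed concepts remain usable wherever dpv:Data is expected.
    print(is_subclass_of("tech:UnstructuredData", "dpv:Data"))
    print(valid_range("dpv:hasUncategorisedData", "dpv:UncategorisedData"))
    print(valid_range("dpv:hasUncategorisedData", "tech:UnstructuredData"))
```

Keeping the two concepts as separate subclasses reflects the point in item 2 that structure and categorisation are independent: data typed as unstructured does not satisfy the range of hasUncategorisedData in this sketch, and vice versa.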

Proposal with @brennanraj

@coolharsh55
Collaborator Author

This was discussed in Meeting MAR-06 and accepted for addition. Both concepts will need explanatory notes covering their specific definitions, how they can be used, and how they relate to the use of labelled data in AI.

coolharsh55 added WIP and removed proposal labels Mar 9, 2025