Support workflow resource type in Resource Catalogue (Q2) #56
@kalxas What is meant by workflow? Do we want to specify this further (e.g., CWL, OpenEO process graph)? |
Related: I have trawled the pycsw and pygeoapi repos for sample resources of various type - no workflows there, though, afaics: |
@GarinSmith to link to related EarthCODE story, pls. |
Hi @j08lue, in summary: after an initial review with Angelos, we agreed that we should use OGC API Records to
This is important because it means (as Richard suggested)
We would like a formal way of validating records against a schema. Can you please suggest something? E.g., we would like EOEPCA+ guidance on how to validate schema compliance. This seems quite complicated. The online schema validators do not seem to cope with $ref instances, and there seem to be a lot of $refs in OGC API Records. I have looked at command-line solutions like Polyglottal JSON Schema Validator, and these seem to struggle too. Could you provide a working example/solution to validate a valid OGC API Record? We could then use this approach in EarthCODE using the above strategy. For above I used (schemas) |
Sure thing. @kalxas, let us discuss, how much Records validation should happen on the API vs UI level. |
Thanks. We would like to know first a reliable way to perform this validation, so that: |
This would mean validating against, directly: https://github.com/opengeospatial/ogcapi-records/blob/master/core/openapi/schemas/recordGeoJSON.yaml . The problem here is that the schema is in YAML, and tools like Python OGC typically pushes out the YAML schemas onto http://schemas.opengis.net/. We need JSON schemas. |
@jonas-eberle @j08lue any type of workflow could be represented with a metadata record. The goal of this task is to define a record schema with extra properties to describe metadata about a workflow |
@GarinSmith I got feedback from @tomkralidis |
@kalxas Thanks. That helps. I don't care about the format (JSON or YAML) as long as it validates against a specific OGC API Records implementation that we can use for a workflow or experiment. I had the same problem above using YAML and $ref. It is very helpful to see the spec referenced here too https://schemas.wmo.int/wcmp/2.0.0/standard/wcmp-2.0.0.pdf I got this to work using I tried it with https://github.com/opengeospatial/ogcapi-records/blob/master/core/examples/json/record.json and got I also tried it with an example openEO implementation I am wondering why a workflow would require |
Note that OGC API - Records allows for time and geometry to be encoded as null. This could be used as part of describing any resource without spatial or temporal properties, while keeping broad interoperability given use of OGC API - Records and GeoJSON. |
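As an illustration of the point above, here is a minimal sketch of a record for a resource with no spatial or temporal extent. The identifier, title, link target, the `properties.type` value and the exact placement of `time` are assumptions made for this example, not taken from a published EOEPCA profile.

```python
import json


def make_workflow_record(identifier, title, links):
    """Build a GeoJSON-style record with no spatial or temporal extent.

    The layout (top-level "time", "properties.type") is an assumption
    about the record structure and should be checked against the schema
    version actually in use.
    """
    return {
        "id": identifier,
        "type": "Feature",
        "geometry": None,  # serialised as JSON null
        "time": None,      # serialised as JSON null
        "properties": {"type": "workflow", "title": title},
        "links": links,
    }


# Hypothetical identifier, title and link target, for illustration only.
record = make_workflow_record(
    "example-workflow",
    "Example workflow",
    [{"rel": "related", "href": "https://example.org/process-graph.json",
      "type": "application/json"}],
)
encoded = json.dumps(record, indent=2)
print(encoded)
```

Python's `None` is written out as JSON `null`, so the record keeps the GeoJSON Feature shape while declaring that it has no geometry or time.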
Thanks. @kalxas , hopefully EOEPCA+ Catalog (or pycsw) will ingest in this format? I think I tried this before with STAC and I could not ingest. Hopefully this will not be an issue for OGC API Records with the latest version of pycsw. |
@GarinSmith pycsw can ingest both OGC API Record and STAC, it has been demonstrated in various EOEPCA demos. We need to define/extend the record to describe the workflows |
@kalxas, great, thanks. That is very good to know. Can we start by "defining" and using the current spec, so that we can flexibly reference the various workflow types that can be described externally? This seems like the separation of concerns we need. We should also try to start with what we already have, if possible. OpenEO links": [ OGC API Processes links": [ Python Processes links": [ Jupyter Notebook Processes links": [ |
IANA defines the link relations: Also see https://developer.mozilla.org/en-US/docs/Web/HTML/Attributes/rel which mentions: The current registries for the possible values of the rel attribute are the IANA link relation registry, the HTML Living Standard, and the freely-editable existing-rel-values page in the microformats wiki, as suggested by the Living Standard. If a rel attribute not present in one of the three sources above is used some HTML validators (such as the W3C Markup Validation Service) will generate a warning. |
My plan is to draft some initial proposal for the next demo. |
Thanks @kalxas, I saw a reference to IANA before, but could not find the links above. My current thoughts are that OGC API Records seems to provide most of what we currently seem to need. However
I note that the IANA link relations above do not seem to cover things like Workflows, Processes or Process Types. However, this is important to us, because we need to know the type of link we are looking at in order to know which platform can handle it. I note that the examples above do validate successfully when I use the check-jsonschema tool. They also correspond with the approach some platforms already use, so they are a useful starting point to move forward from. |
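To make the requirement above concrete, here is a minimal sketch of how a client might route links to a platform based on a per-link type. The `rel` values and handler names below are hypothetical; none of them are IANA-registered link relations, and the record content is invented for illustration.

```python
from collections import defaultdict

# Hypothetical mapping from a link's "rel" value to the platform that can
# execute it. These values are illustrative and not registered with IANA.
HYPOTHETICAL_HANDLERS = {
    "openeo-process": "openEO backend",
    "ogcapi-process": "OGC API Processes server",
    "jupyter-notebook": "notebook service",
}


def route_links(record):
    """Group link targets by the platform assumed to handle each link."""
    routed = defaultdict(list)
    for link in record.get("links", []):
        handler = HYPOTHETICAL_HANDLERS.get(link.get("rel"), "unhandled")
        routed[handler].append(link.get("href"))
    return dict(routed)


# Invented example record with one recognised and one generic link.
example_record = {
    "links": [
        {"rel": "openeo-process", "href": "https://example.org/graph.json"},
        {"rel": "related", "href": "https://example.org/docs.html"},
    ]
}
routes = route_links(example_record)
print(routes)
```

Links with an unrecognised `rel` end up under "unhandled", which is the case the thread is worried about: without an agreed vocabulary, a client cannot tell which platform a link is meant for.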
I have created a new repository that will host the metadata schema for EOEPCA profile(s): The resource schema was initialized with the OGC API Records schema: An enumeration is provided for the resource type which can be further expanded to support various types (as required above). From that initial resource definition, I have created a JSON Schema bundle as described in WMO by @tomkralidis Validation process described here: |
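For readers unfamiliar with schema bundling, the idea is to inline every `$ref` so that validators which cannot follow references still work. The sketch below is a simplified illustration using only the standard library. It is not the tooling used in the repository, and the schema file names and contents are hypothetical.

```python
import copy
import json


def bundle(schema, registry, _seen=()):
    """Return a copy of `schema` with every {"$ref": name} replaced inline.

    `registry` maps a reference string to an already-parsed schema dict.
    Only whole-document references are handled. JSON Pointer fragments
    (e.g. "#/definitions/x") and remote URLs are out of scope, and any
    keywords that sit next to a "$ref" are dropped.
    """
    if isinstance(schema, list):
        return [bundle(item, registry, _seen) for item in schema]
    if not isinstance(schema, dict):
        return schema
    ref = schema.get("$ref")
    if isinstance(ref, str):
        if ref in _seen:
            raise ValueError(f"circular $ref: {ref}")
        if ref not in registry:
            raise KeyError(f"unresolved $ref: {ref}")
        target = copy.deepcopy(registry[ref])
        return bundle(target, registry, _seen + (ref,))
    return {key: bundle(value, registry, _seen) for key, value in schema.items()}


# Hypothetical schema fragments, loosely modelled on a record schema.
registry = {
    "linkBase.json": {"type": "object", "required": ["href"]},
    "time.json": {"type": ["object", "null"]},
}
record_schema = {
    "type": "object",
    "properties": {
        "time": {"$ref": "time.json"},
        "links": {"type": "array", "items": {"$ref": "linkBase.json"}},
    },
}
bundled = bundle(record_schema, registry)
print(json.dumps(bundled, indent=2))
```

The bundled output contains no `$ref` entries, so it can be passed to a validator as a single file. The input schema is left unmodified.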
Thanks @kalxas. It might help to test this against a typical scenario that EarthCODE might want to use. Could you please update it to include a typical "EOEPCA resource type", for instance a "workflow"? However, how will we know what type of workflow we are dealing with (openeo-process in this case)? Does the type of "workflow" have to go at the link level if there is more than one link? Many thanks, Garin |
Hi @kalxas and @rconway, Angelos can you please tweak worldcereal_inference2.json above, so that it validates against your latest schema? It would help if there was clear meaning to the following EOEPCA resource types that map to the EarthCODE utilisation domain. Note that EarthCODE has the concept of Workflow, Experiment, Application and Product. We need to map to these somehow, hence my comment above. |
Thank you @GarinSmith |
There is a bug in the schema provided; I will work to fix it |
Schema updated: |
@GarinSmith this is the example that validates: |
Hi @kalxas , |