Merge pull request #580 from crim-ca/echo-process+support-uom
fmigneault authored Dec 13, 2023
2 parents 74c714f + 60d1c08 commit f30313f
Showing 49 changed files with 4,005 additions and 762 deletions.
10 changes: 6 additions & 4 deletions .pylintrc
@@ -380,9 +380,11 @@ class-attribute-naming-style=any
# Naming style matching correct class names.
class-naming-style=PascalCase

# Regular expression matching correct class names. Overrides class-naming-
# style.
#class-rgx=
# Regular expression matching correct class names. Overrides class-naming-style.
# Allow typing definitions that are matched as 'classes' to have slightly more versatile names.
class-rgx=((_{0,2}[A-Z][a-zA-Z0-9]+)|((CWL|PKG|WPS|OAS|JSON|IO|ANY)_[a-zA-Z0-9_]+Types?))$

typealias-rgx=_{0,2}(?!T[A-Z]|Type)((CWL|PKG|WPS|OAS|JSON|IO|ANY)_)?[A-Z]+[a-z0-9]+(?:[A-Z][a-z0-9]+)*$

# Naming style matching correct constant names.
const-naming-style=UPPER_CASE
@@ -406,7 +408,7 @@ function-naming-style=snake_case
good-names=i,j,k,v,kv,ex,x,y,z,f,h,db,kw,dt,q,ns,id,s3,to,_

# Include a hint for the correct naming format with invalid-name.
include-naming-hint=no
include-naming-hint=yes

# Naming style matching correct inline iteration names.
inlinevar-naming-style=any
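
As a quick check of the two patterns introduced in this hunk, the sketch below applies them to a few names. It assumes pylint anchors each pattern at the start of the name (as `re.match` does). Apart from `ExecuteInputValues`, which appears in CHANGES.rst, the sample names are hypothetical and chosen only for illustration.

```python
import re

# Patterns copied from the .pylintrc hunk.
CLASS_RGX = re.compile(
    r"((_{0,2}[A-Z][a-zA-Z0-9]+)|((CWL|PKG|WPS|OAS|JSON|IO|ANY)_[a-zA-Z0-9_]+Types?))$"
)
TYPEALIAS_RGX = re.compile(
    r"_{0,2}(?!T[A-Z]|Type)((CWL|PKG|WPS|OAS|JSON|IO|ANY)_)?[A-Z]+[a-z0-9]+(?:[A-Z][a-z0-9]+)*$"
)


def is_valid(pattern, name):
    # Assumption: names are matched from their first character.
    return pattern.match(name) is not None


# Hypothetical sample names (not taken from the code base, except the first).
for name in ["ExecuteInputValues", "_PrivateClass", "CWL_IO_Types", "snake_case_class"]:
    print("class    ", name, is_valid(CLASS_RGX, name))
for name in ["AnyValue", "CWL_Input", "TInput"]:
    print("typealias", name, is_valid(TYPEALIAS_RGX, name))
```

The negative lookahead in `typealias-rgx` is what rejects names such as `TInput`, which look like `TypeVar` names rather than type aliases.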
59 changes: 52 additions & 7 deletions CHANGES.rst
@@ -12,11 +12,53 @@ Changes

Changes:
--------
- No change.

Fixes:
------
- No change.
- Add ``weaver.formats.ContentEncoding`` with handlers for common encoding manipulation from input values.
- Add |oap_echo|_ to the list of ``weaver.processes.builtin`` definitions with its `CWL` representation and
complementary `OGC API - Processes` reference implementation details. This `Process` will be automatically deployed
at `API` startup, and is employed to validate multiple parsing combinations of execution I/O values and encodings
(fixes `#379 <https://github.com/crim-ca/weaver/issues/379>`_).
- Add support of `OGC` `BoundingBox` definition (``bbox`` and ``crs`` fields) as `Process` execution input value
with appropriate schema validation (fixes `#51 <https://github.com/crim-ca/weaver/issues/51>`_).
- Add support of `Unit of Measure` (`UoM`) definition (``measurement`` and ``uom`` fields) as `Process` execution
input value with appropriate schema validation (fixes `#430 <https://github.com/crim-ca/weaver/issues/430>`_).
- Add ``create_metalink`` utility function to facilitate generation of a ``.meta4`` or ``.metalink`` file definition
from a list of file link references (relates to `#25 <https://github.com/crim-ca/weaver/issues/25>`_).
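
The two new input forms can be illustrated with a small request body. This is a minimal sketch only: the input names `area` and `distance` are hypothetical, and the enclosing `inputs` mapping is assumed from the usual `OGC API - Processes` execution body. Only the ``bbox``/``crs`` and ``measurement``/``uom`` field names come from the entries above.

```python
import json

# Hypothetical execution body; the input names "area" and "distance" are made up.
execute_body = {
    "inputs": {
        # BoundingBox value: "bbox" coordinates with a "crs" URI.
        "area": {
            "bbox": [1.0, 2.0, 3.0, 4.0],
            "crs": "http://www.opengis.net/def/crs/OGC/1.3/CRS84",
        },
        # Unit of Measure value: numeric "measurement" with its "uom".
        "distance": {"measurement": 10.5, "uom": "m"},
    }
}

payload = json.dumps(execute_body, indent=2)
print(payload)
```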

Fixes:
------
- Fix ``weaver.wps_restapi.swagger_definitions.ExecuteInputValues`` deserialization that sometimes silently dropped
invalid `JSON`-formatted inputs that did not fulfill schema validation. This was caused by a side effect regarding
how ``weaver.wps_restapi.colander_extras.VariableSchemaNode`` handled "unknown" `JSON` ``properties`` from submitted
content. In cases where *required* `Process` inputs were causing the invalid schema, `Job` execution would be aborted
and the error would be reported due to "missing" inputs. However, if the `JSON` failing schema validation happened to
be nested under an *optional* input definition, the `Job` execution could have resumed silently by omitting this
input's value propagation to the downstream `CWL`, `WPS` or `OGC API - Processes` implementation, which could make
it use a default value instead of the real input that was submitted for the `Job`.
- Fix schema name representation employed in generated ``colander.Invalid`` error when a schema validation failed, in
order to better represent deeply nested schema using multiple ``oneOf``, ``anyOf``, ``allOf`` schema nodes.
Using ``colander.Invalid.asdict``, each dictionary key now properly indicates the specific path of sub-nodes with
their relevant schema validation error.
- Fix ``variable`` schema node names to provide a ``{SchemaName}<{VariableName}>`` representation, such that it can be
more easily identified. Schema nodes with a ``variable`` (i.e.: schema under ``additionalProperties``) previously only
indicated ``{VariableName}``, which made it complicated to follow reference schema classes that formed the error path.
Each of the evaluated fields against each possible ``variable`` schema will now report their corresponding nested
schema validation error as ``{SchemaName}<{VariableName}>({field})`` such that results can be understood.
- Fix execution input reference (i.e.: using ``href``) dropping a ``schema`` URL reference if provided explicitly.
This parameter now remains within the produced content passed to the `Job`, and forwarded to a remote `Process` if
applicable, but no further schema validation is accomplished with the value in ``schema`` for the moment.
- Fix ``ContentType.IMAGE_OGC_GEOTIFF`` using invalid media-type name (missing ``i`` in ``image``).
- Fix `Job` input validation stripping additional parameters from provided Media-Type, potentially causing mismatching
Content-Type validation against the corresponding `Process` description inputs. Types should now match exactly the
original `Process` definition, including any additional parameters and sub-types.
- Fix resolution of ``anyOf`` schema raising ``colander.Invalid`` even when the property was marked as optional
using ``missing=colander.drop``.
- Fix ``$schema`` of `OGC` ``nameReferenceType`` being reported under every ``dataType`` of ``literalDataDomains`` for
literal `I/O` of `Process` descriptions. The reference is now only included in the `OpenAPI` definition as intended.
- Fix override of `CWL` ``stderr`` and ``stdout`` definitions if specified by the original *Application Package* for
its own implementation. These stream handles are added to the `CWL` by Weaver to provide more contextual debugging
and traceability details of the internal application executed by the `Process`. However, a package making use of this
functionality of `CWL` to capture an output file would be broken unless naming the file exactly as ``stderr.log`` and
``stdout.log``. Weaver will now employ the parameters provided by the *Application Package* if specified.
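
To illustrate the Media-Type fix listed above, the sketch below shows how stripping parameters from a submitted type can produce a spurious mismatch against the declared `Process` input type. It is not Weaver's implementation; the helper name and sample values are hypothetical.

```python
def strip_parameters(media_type):
    # Keep only the "type/subtype" part, dropping e.g. "; charset=UTF-8".
    return media_type.split(";", 1)[0].strip()


# Hypothetical values: type declared by the Process and type submitted with the Job.
declared = "application/json; charset=UTF-8"
submitted = "application/json; charset=UTF-8"

# Stripping the parameters before comparing loses information and fails to match.
stripped_matches = strip_parameters(submitted) == declared
# Keeping the parameters lets the two identical types match.
exact_matches = submitted == declared

print(stripped_matches)  # False
print(exact_matches)     # True
```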

.. _changes_4.38.0:

@@ -83,6 +125,9 @@ Changes:
- Add more tests to validate core code paths of ``builtin`` `Process` ``jsonarray2netcdf``, ``metalink2netcdf`` and
``file_index_selector`` with validation of happy path and error handling conditions.

.. _oap_echo: https://schemas.opengis.net/ogcapi/processes/part1/1.0/examples/json/ProcessDescription.json
.. |oap_echo| replace:: ``EchoProcess``

Fixes:
------
- Fix invalid parsing of `XML` Metalink files in ``metalink2netcdf``. Metalink V3 and V4 will now properly consider the
@@ -1889,8 +1934,8 @@ Changes:
where potentially inaccessible (according to settings). Definition of `CWL` package will need to add
`InitialWorkDirRequirement <https://www.commonwl.org/v1.0/CommandLineTool.html#InitialWorkDirRequirement>`_ as
defined by the reference specification to stage those files if they need to be accessed with write permissions
(see: `example <https://www.commonwl.org/user_guide/15-staging/>`_). Addresses some issues listed in
`#155 <https://github.com/crim-ca/weaver/issues/155>`_.
(see: `example <https://www.commonwl.org/user_guide/topics/staging-input-files.html>`_).
Addresses some issues listed in `#155 <https://github.com/crim-ca/weaver/issues/155>`_.
- Enforce removal of some invalid `CWL` hints/requirements that would break the behaviour offered by ``Weaver``.
- Use ``weaver.request_options`` for `WPS GetCapabilities` and `WPS Check Status` requests under the running job.
- Change default ``DOCKER_REPO`` value defined in ``Makefile`` to point to reference mentioned in ``README.md`` and
6 changes: 5 additions & 1 deletion Makefile
@@ -281,7 +281,7 @@ install-npm: ## install npm package manager and dependencies if they cannot b
install-npm-stylelint: install-npm ## install stylelint dependency for 'check-css' target using npm
@[ `npm ls 2>/dev/null | grep stylelint-config-standard | wc -l` = 1 ] || ( \
echo "Install required dependencies for CSS checks." && \
npm install stylelint stylelint-config-standard --save-dev \
npm install "stylelint@<16" "stylelint-config-standard@<35" --save-dev \
)

.PHONY: install-npm-remarklint
@@ -850,3 +850,7 @@ stop: ## kill application instance(s) started with gunicorn (pserve)
.PHONY: stat
stat: ## display processes with PID(s) of gunicorn (pserve) instance(s) running the application
@lsof -i :4001 || echo "No instance running"

# Reapply config if overrides were defined.
# Ensure overrides take precedence over targets and auto-resolution logic of variables.
-include Makefile.config
25 changes: 23 additions & 2 deletions docs/source/appendix.rst
@@ -20,7 +20,7 @@ Glossary
API
| Application Programming Interface
| Most typically, referring to the use of HTTP requests following an :term:`OpenAPI` specification.
| Most typically, referring to the use of HTTP(S) requests following an :term:`OpenAPI` specification.
Application Package
General term that refers to *"what and how to execute"* the :term:`Process`. Application Packages provide the
@@ -103,7 +103,8 @@ Glossary
Alternative operation modes are described in :ref:`Configuration Settings`.
I/O
Inputs and/or Outputs of CWL and/or WPS depending on context.
Inputs and/or Outputs of :term:`CWL`, :term:`OAP`, :term:`WPS` or :term:`OAS` representation
depending on context.

IANA
Ontology that regroups multiple definitions, amongst which `Weaver` looks up most of its known and supported
@@ -142,6 +143,7 @@ Glossary
OGC
|ogc|_

OAP
OGC API - Processes
The new :term:`API` that defines :term:`JSON` REST-binding representation
of :term:`WPS` :term:`Process` collection.
@@ -205,6 +207,25 @@ Glossary
transferred to the :term:`Process` as specified, and it is up to the underlying :term:`Application Package`
definition to interpret it as deemed fit.
URI
| Uniform Resource Identifier
| Superset of :term:`URL` and :term:`URN` that uses a specific string format to identify a resource.
URL
| Uniform Resource Locator
| Subset of :term:`URI` that follows the ``<scheme>://<scheme-specific-part>`` format, as per :rfc:`1738`.
Specifies where an identified resource is available and the protocol mechanism employed for retrieving it.
This is employed in `Weaver` for ``http(s)://``, ``s3://`` and ``file://`` locations by :term:`I/O`, or in
general to refer to :term:`API` locations.
URN
| Uniform Resource Name
| Subset of :term:`URI` that follows the ``urn:<namespace>:<specific-part>`` format, as per :rfc:`8141`.
It is used to register a unique reference to a named entity such as a :term:`UoM` or other common definitions.
.. seealso::
- `IANA URN Namespaces <https://www.iana.org/assignments/urn-namespaces/urn-namespaces.xhtml>`_

Vault
Secured storage employed to upload files that should be temporarily stored on the `Weaver` server for
later retrieval using an access token.
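
The URL/URN distinction drawn by the new glossary entries can be sketched with the standard library. This is an illustrative classification by scheme only, not how `Weaver` resolves references; the function name and the sample references are hypothetical.

```python
from urllib.parse import urlparse


def classify(reference):
    """Roughly classify a reference as URL or URN from its scheme (illustrative only)."""
    scheme = urlparse(reference).scheme.lower()
    if scheme == "urn":
        return "URN"
    # Schemes named in the glossary entry for URL.
    if scheme in {"http", "https", "s3", "file"}:
        return "URL"
    return "unknown"


# Hypothetical sample references.
for ref in [
    "https://example.com/processes",
    "s3://bucket/data.nc",
    "file:///tmp/data.nc",
    "urn:ogc:def:crs:EPSG::4326",
    "relative/path.nc",
]:
    print(ref, "->", classify(ref))
```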
6 changes: 3 additions & 3 deletions docs/source/package.rst
@@ -1174,7 +1174,7 @@ Below is a list of compatible elements.
| Parameters in :term:`WPS` Context | Parameters in :term:`CWL` Context |
+=========================================+==========================================================+
| ``keywords`` | ``s:keywords`` (expecting ``s`` in ``$namespace`` |
| | referring to http://schema.org [#schemaorg]_) |
| | referring to http://schema.org [#cwl_schemaorg]_) |
+-----------------------------------------+----------------------------------------------------------+
| ``metadata`` | ``$schemas``/``$namespace`` |
| (using ``title`` and ``href`` fields) | (using namespace name and HTTP references) |
@@ -1186,8 +1186,8 @@ Below is a list of compatible elements.

.. rubric:: Footnotes

.. [#schemaorg]
See example: https://www.commonwl.org/user_guide/17-metadata/index.html
.. [#cwl_schemaorg]
See example: `cwl-metadata`_
.. |br| raw:: html

4 changes: 2 additions & 2 deletions docs/source/references.rst
@@ -34,7 +34,7 @@
.. _cwl-workflow: https://www.commonwl.org/v1.1/Workflow.html
.. |cwl-workdir-req| replace:: InitialWorkDirRequirement
.. _cwl-workdir-req: https://www.commonwl.org/v1.1/CommandLineTool.html#InitialWorkDirRequirement
.. _cwl-workdir-ex: https://www.commonwl.org/user_guide/15-staging/
.. _cwl-workdir-ex: https://www.commonwl.org/user_guide/topics/staging-input-files.html
.. |cwl-docker-req| replace:: DockerRequirement
.. _cwl-docker-req: https://www.commonwl.org/v1.1/CommandLineTool.html#DockerRequirement
.. FIXME apply official CWL specification location
@@ -49,7 +49,7 @@
.. _cwl-io-map: https://www.commonwl.org/v1.1/CommandLineTool.html#map
.. |cwl-io-type| replace:: CWLType Symbols
.. _cwl-io-type: https://www.commonwl.org/v1.1/CommandLineTool.html#CWLType
.. _cwl-metadata: https://www.commonwl.org/user_guide/17-metadata/index.html
.. _cwl-metadata: https://www.commonwl.org/user_guide/topics/metadata-and-authorship.html
.. _docker: https://docs.docker.com/develop/
.. |docker| replace:: Docker
.. |ems| replace:: Execution Management Service
8 changes: 5 additions & 3 deletions requirements.txt
@@ -44,7 +44,8 @@ geotiff>=0.2.8
# use pserve to continue supporting config.ini with paste settings
gunicorn>=20.0.4
# reduced dependency constraints to let packages update to latest (https://github.com/vinitkumar/json2xml/issues/157)
json2xml>=3.20.0
# even more reduced dependency constraints (https://github.com/vinitkumar/json2xml/pull/195)
json2xml>=4.1.0
jsonschema>=3.0.1
# FIXME: kombu for pymongo>=4 not yet released as 5.3.0 (only pre-releases available)
# - https://github.com/crim-ca/weaver/issues/386
@@ -60,8 +61,9 @@ mypy_boto3_s3
numpy>=1.22.2
# esgf-compute-api (cwt) needs oauthlib but doesn't add it in their requirements
oauthlib
owslib==0.28.1
owslib==0.29.3
PasteDeploy>=3.1.0; python_version >= "3.12"
pint
psutil
# FIXME: pymongo>=4 breaks with kombu corresponding to pinned Celery
# - https://github.com/crim-ca/weaver/issues/386
@@ -78,7 +80,7 @@ python-dateutil
pyramid_rewrite
pyramid_storage
pytz
pywps==4.5.2
pywps==4.6.0
pyyaml>=5.2
rdflib>=5 # pyup: ignore
requests
43 changes: 43 additions & 0 deletions tests/functional/application-packages/EchoBoundingBox/deploy.yml
@@ -0,0 +1,43 @@
# YAML representation supported by WeaverClient
processDescription:
  process:
    id: EchoBoundingBox
    title: Test Echo Bounding Box
    version: "1.0"  # must be string, avoid interpretation as float
    description: Dummy process that simply echoes back the input bbox for testing purposes.
    keywords:
      - test
    inputs:
      bboxInput:
        schema:
          type: object
          format: ogc-bbox
          required: ["bbox"]
          properties:
            bbox:
              type: array
              items:
                type: number
            crs:
              type: string
    outputs:
      bboxOutput:
        schema:
          type: object
          format: ogc-bbox
          required: ["bbox"]
          properties:
            bbox:
              type: array
              items:
                type: number
            crs:
              type: string
  jobControlOptions:
    - async-execute
  outputTransmission:
    - reference
executionUnit:
  # note: This does not work by itself! The test suite injects the file dynamically.
  - href: "tests/functional/application-packages/EchoBoundingBox/echo_bbox.cwl"
deploymentProfileName: "http://www.opengis.net/profiles/eoc/dockerizedApplication"
@@ -0,0 +1,23 @@
cwlVersion: "v1.0"
class: CommandLineTool
baseCommand: cat
requirements:
  DockerRequirement:
    dockerPull: "debian:stretch-slim"
inputs:
  bboxInput:
    # for CWL, bbox is simply a JSON file!
    type: File
    format: "iana:application/json"
    inputBinding:
      position: 1
outputs:
  bboxOutput:
    # for CWL, bbox is simply a JSON file!
    type: File
    format: "iana:application/json"
    outputBinding:
      glob: "bbox.json"
stdout: "bbox.json"
$namespaces:
  iana: "https://www.iana.org/assignments/media-types/"
11 changes: 11 additions & 0 deletions tests/functional/application-packages/EchoBoundingBox/execute.yml
@@ -0,0 +1,11 @@
bboxInput:
  # note: values can be integer, but use float here to make testing easier (values are auto-converted)
  bbox: [1., 2., 3., 4.]
  # note:
  # OGC-API BBOX definition is strict about needing a URI.
  # A simple namespaced CRS (eg: "urn:ogc:def:crs:EPSG::4326" or "EPSG:4326") is not sufficient.
  # Definitions can be loosened when using a nested qualified value, since bbox essentially becomes a generic object.
  # see:
  # - https://schemas.opengis.net/ogcapi/processes/part1/1.0/openapi/schemas/bbox.yaml
  # - https://github.com/opengeospatial/ogcapi-processes/blob/master/openapi/schemas/processes-core/bbox.yaml
  crs: "http://www.opengis.net/def/crs/OGC/1.3/CRS84"
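
The constraints described in the comments of this execution file can be sketched as a small check. This is not the schema validation performed by Weaver. The function name is hypothetical, the "4 or 6 numbers" rule is an assumption based on a reading of the referenced ``bbox.yaml`` schema, and the HTTP(S) prefix test is a simplification of the note that namespaced CRS forms are not sufficient.

```python
def check_bbox_input(value):
    """Return a list of problems found in a bbox input (empty if none)."""
    problems = []
    bbox = value.get("bbox")
    # Assumption: a bbox holds 4 (2D) or 6 (3D) coordinates.
    if not isinstance(bbox, list) or len(bbox) not in (4, 6):
        problems.append("bbox must list 4 or 6 numbers")
    elif not all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in bbox):
        problems.append("bbox items must be numbers")
    crs = value.get("crs")
    # Simplification: accept only HTTP(S) URIs, rejecting "EPSG:4326" or URN forms.
    if crs is not None and not crs.startswith(("http://", "https://")):
        problems.append("crs must be an HTTP(S) URI")
    return problems


valid = {"bbox": [1.0, 2.0, 3.0, 4.0], "crs": "http://www.opengis.net/def/crs/OGC/1.3/CRS84"}
short_crs = {"bbox": [1.0, 2.0, 3.0, 4.0], "crs": "EPSG:4326"}

print(check_bbox_input(valid))      # []
print(check_bbox_input(short_crs))  # ['crs must be an HTTP(S) URI']
```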
@@ -26,7 +26,7 @@ inputs:
allOf:
- format: ogc-bbox
# - $ref: ../../openapi/schemas/bbox.yaml
- $ref: https://raw.githubusercontent.com/opengeospatial/ogcapi-processes/d52579/core/openapi/schemas/bbox.yaml
- $ref: https://schemas.opengis.net/ogcapi/processes/part1/1.0/openapi/schemas/bbox.yaml
title: Bounding Box Input Example
complexObjectInput:
description: This is an example of a complex object input.
@@ -156,7 +156,7 @@ outputs:
allOf:
- format: ogc-bbox
# - $ref: ../../openapi/schemas/bbox.yaml
- $ref: https://raw.githubusercontent.com/opengeospatial/ogcapi-processes/d52579/core/openapi/schemas/bbox.yaml
- $ref: https://schemas.opengis.net/ogcapi/processes/part1/1.0/openapi/schemas/bbox.yaml
complexObjectOutput:
schema:
properties: