Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is this the right approach ? #73

Closed
JBBianchi opened this issue Apr 28, 2021 · 17 comments · Fixed by #82
Closed

Is this the right approach ? #73

JBBianchi opened this issue Apr 28, 2021 · 17 comments · Fixed by #82
Labels
question Further information is requested

Comments

@JBBianchi
Copy link
Member

What is the question:
The current state of the project required a lot of work from @antmendoza already and still needs a lot of efforts to cover the whole specification. The result is (and will be) nice but the maintenance will probably be hard and tricky when modifications are made to the specification as the SDK will need to be modified manually, with no real way to compare the specs and the SDK aside by running tests.

An alternative solution would be to generate most of the code based on the JSON Schema and provide a generic way to build/validate objects. The validation can probably be achieved with ajv but the generation is a trickier. In the current state of my researches, I found two candidate tools that aim to generate TS from JSON Schema, json-schema-to-typescript and quicktype. Unfortunatly, none produce a satisfying result and both are in a kind of unmaintained, making them not very reliable.

Than being said, I think there is two possible ways, 1. keep on going with what we have, it requires work but it's predicatably going to produce something satisfying, the problem being the maintenance imo, 2. try to work on the generation of TS, but it might lead nowhere if we fail to produce a satisfying result and probably is more complexe but would probably be more sustainable on the long run.

(this is a follow-up on the digression of #69 (comment) )

@tsurdilo
Copy link
Contributor

With the java sdk we took a hybrid approach, generation + custom serializers/deserializers to deal with complex parts of the schema.

As far the maintenance part, yeah there will be changes until we get to 1.0RC1 (scheduled end of year atm) at which point the amount of changes to schemas should drastically reduce until we reach 1.0.

@antmendoza
Copy link
Member

Hi,

as far I have seen (and you mention @JBBianchi ), the actual code generated by libraries like json-schema-to-typescript is far from what we need.

If we continue with the actual approach, I would said that in a couple of weeks we could have most of the specification covered.

I don't want to close the door to try any other approach, now or in the future. I really would like to have the code autogenerated.. it is a pain actually.
Even if we managed to generate the code based on the JSON Schema, the API (the actual builders) should not suffer remarkable changes... as I am trying to stick to the names as they are in the schema.

Coding and testing the builders is the most time consuming, and this is something we will still have to do. Do you know a better way to do it? I have not found any decent library for typescript (except builder-pattern). Something like lombok (for java) or the plugging been using in the skd-java would be a plus.

@tsurdilo
Copy link
Contributor

is there anything from the spec side that we can do to help with issue of having to do large changes? Once we get close to 1.0 spec release the dsl should stabilize and things should not have to change much . Up until then changes will happen :(

@JBBianchi
Copy link
Member Author

I did some more experiements and research in the last few days.

I tried first to dig into quicktype to see if it would be possible to patch it so it handles union types instead of merging the objects classes but I just pulled my hair in the intersection/union loop and lost my time. (No wonder the issue is known since the beginning of the project but has never been solved)

Then I stumbled upon dtsgenerator. It doesn't support external $ref as far as I understood but, by merging the documents manually, I produced something that might go in the right direction, here is what the result looks like: https://paste.serveur.io/RmDDhYhr.ts

From that point on, I'll try to find a way to merge the documents into one schema even if I'm not sure how to yet. I have difficulties to understand "referenced" documents by themselves, even if they are valid JSON Schema. For instance functions.json, at the top level, describes an object (L4, "type": "object") but there is no "properties" key. Instead, there is a top level "functions" key but I don't really understand how it's supposed to be "understood" on its own. If I put it in perspective with workflow.json, it makes more sense as the $ref points directly to functions.json#/functions, so it's explicit. In the ends, it probably means having to properly parse the references which is not trivial. Maybe it's possible to take advantage of json-schema-ref-parser (used by json-schema-to-typescript) but I'm pretty sure it's not gonna be easy as it's an open issue of dtsgenerator.
The next step would probably be to write plugin for dtsgenerator to clean/simplify/modify the output structure, not sure it's possible to really achieve 100% flexibility but there is probably a way to change it a bit.

That brings me to two questions:

  • Why are some objects out of workflow.json, like functions, events, retries, but the majority are embedded in workflow.json#/definitions?
  • Would it make sense to "avoid" using the top level and rather use definitions instead, with a $ref entry point if necessary ? e.g.:
{
  "$id": "https://serverlessworkflow.org/core/workflow.json",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "description": "Serverless Workflow specification - workflow schema",
  "$ref": "#/definitions/workflow",
  "definitions": {
    "workflow": {
      "type": "object",
      "properties": {
        "id": {
          "type": "string",
          "description": "Workflow unique identifier",
          "minLength": 1
        },
        "key": {
          "type": "string",
          "description": "Domain-specific workflow identifier",
          "minLength": 1
        },
        "name": {
          "type": "string",
          "description": "Workflow name",
          "minLength": 1
        },
...

I'm really out of my depth here, I'm not very familiar with the json schema spec so sorry if I don't make much sense or my questions/remarks are not very pertinent.

@tsurdilo
Copy link
Contributor

@JBBianchi
Regarding: Why are some objects out of workflow.json, like functions, events, retries, but the majority are embedded in workflow.json#/definitions?

function, event and retry definition can be defined in separate json/yaml files outside of the main workflow definition. Within the workflow def they can be referenced by URI, for example:

"functions": "myfunctionsdef.json"

this is the reason those are in separate, individual schemas, so users can validate them on their own.

@tsurdilo
Copy link
Contributor

tsurdilo commented Apr 30, 2021

Would it make sense to "avoid" using the top level and rather use definitions instead, with a $ref entry point if necessary ? e.g.: - I think that would be possible. I can try and test if that changes the def in any way

@JBBianchi
Copy link
Member Author

@JBBianchi
Regarding: Why are some objects out of workflow.json, like functions, events, retries, but the majority are embedded in workflow.json#/definitions?

function, event and retry definition can be defined in separate json/yaml files outside of the main workflow definition. Within the workflow def they can be referenced by URI, for example:

"functions": "myfunctionsdef.json"

this is the reason those are in separate, individual schemas, so users can validate them on their own.

Thanks for your input. I still wonder why "functions/events/retries" are external where "delaystate/eventstate/operationstate/..." are internal. What is the motivation for some of them to be internal and some external ? I'm not sure but I guess a user could validate a "delaystate" even if it's not inside its own file/schema but in a "shared" one? Sorry if those questions are a bit stupid.

@tsurdilo
Copy link
Contributor

tsurdilo commented Apr 30, 2021

@JBBianchi sorry maybe i dont understand the question. the reason we did it like this (and not saying that we couldnt change if there is a valid reason for it) is that since event / retry /function definitions can be referenced as uri (so they can be defined in a separate file and reused between multiple workflow defs) you might have a situation where one team develops the workflows and another team writes the event and function definitions. in that scenario the team that is responsible for event / function defs can validate them without having to use the workflow schema.

@tsurdilo
Copy link
Contributor

i think the issue really is more that most json schema generation tools are unable to deal with more complex schemas, and not our schema itself :)

@JBBianchi
Copy link
Member Author

@tsurdilo indeed, I'm getting sidetracked here, sry about that.

@antmendoza
Copy link
Member

antmendoza commented May 1, 2021

Hi @JBBianchi this is a huge step 👏 👏

I have tried to accommodate your code to the actual implementation and works like a charm.. ;) https://github.com/antmendoza/sdk-typescript/tree/JBBianchi-code

More than happy to have your code checked in when you think that it is the right time.

@JBBianchi
Copy link
Member Author

Thanks @antmendoza

I spent some more time working on the alternative approach I had in mind and came up with something I am 'kind of' satisfied with:
https://github.com/neuroglia-io/sdk-typescript/tree/alternative-approach
(Important: run npm run update-code-base after npm install)

This fork have the following differences:

Still pending:

  • Allow browser usage #65 needs to be checked. I think that, in the current state, the browser usage is not supported because of the JSON imports in the validators. It might be solved by bundlers.

Moderate severity issues:

  • Migrate all existing tests
  • The types are generated as type or interface so the "sub types" of state (delay state, event state, operation state, ...) type property is not defined by default and needs to be defined by the user, e.g.:
injectstateBuilder()
  .type("inject") // <--- should be implied
  .name("Hello State")
  .data({
    "result": "Hello World!"
  })
  .end(true)
  .build()
  • The type name of workflow is generated as WorkflowJson which is 1. misleading, 2. ugly + there is both a type and a namespace with the same name... 🤢
  • CJS/UMD/ESM / bundling

Low severity issues:

  • The schema properties name case leads to "ugly names" (e.g.: delaystate results in the type Delaystate and builder file delaystate-builder.ts where it could be DelayState & delay-state-builder.ts), but that's something to solve at the schema level.
  • The validators type->validation function map is manually maintained. With some efforts, it could be generated instead.
  • The current validators doesn't respect the initial design of the WorkflowValidator but it could easily be done if we want to

Please, have a look and give me your feedback, I'd be grateful.

@tsurdilo tsurdilo added the question Further information is requested label May 3, 2021
@antmendoza
Copy link
Member

antmendoza commented May 3, 2021

Hi @JBBianchi I have no words!!! 👏 👏 👏

I've been trying something similar (more or less.. ) for a month with no results... a lot to learn form my side here!

Let me check it, but I don't think I will come with any better idea or anything to say...

Thank you!!

@antmendoza
Copy link
Member

antmendoza commented May 3, 2021

@JBBianchi

  • /spec became /tests

👍

I still do not understand why default registry should be added, it is not that I disagree.

and for the engines, is there any issue working with the node v14 or lower ?

Still pending:

  • Allow browser usage #65 needs to be checked. I think that, in the current state, the browser usage is not supported because of the JSON imports in the validators. It might be solved by bundlers.

Ok, I have only experience on the server side. Let me know if is there anything I can do.

Moderate severity issues:

  • Migrate all existing tests

No problem, I will do it, do not delete the existing code in the PR yet, please.

  • The types are generated as type or interface so the "sub types" of state (delay state, event state, operation state, >...) type property is not defined by default and needs to be defined by the user, e.g.:
injectstateBuilder()
 .type("inject") // <--- should be implied
 .name("Hello State")
 .data({
   "result": "Hello World!"
 })
 .end(true)
 .build()

You already have the solution, pretty sure ;) . Just to try to contribute with something, could be done by adding to the builder initialisation the initial state of the object.

We could end up with something like:

  const state = {
    type: "inject"
  };
  return builder<Injectstate>(injectstateValidator, state);

where state stored (by hand for now) and taken from some file by your generator.

What do you think?

  • The type name of workflow is generated as WorkflowJson which is 1. misleading, 2. ugly + there is both a type and a namespace with the same name... 🤢

👍

  • CJS/UMD/ESM / bundling

👍

Low severity issues:

  • The schema properties name case leads to "ugly names" (e.g.: delaystate results in the type Delaystate and builder file delaystate-builder.ts where it could be DelayState & delay-state-builder.ts), but that's something to solve at the schema level.

no problem, I think that the java-sdk is doing it the same way.

👍

  • The current validators doesn't respect the initial design of the WorkflowValidator but it could easily be done if we want to

creating a wrapper is an option? we have to add also the integrity validations for the workflow object.

And again, great work!! 😜 🥳

@JBBianchi
Copy link
Member Author

JBBianchi commented May 4, 2021

@antmendoza Thanks for your support and feedback

I still do not understand why default registry should be added, it is not that I disagree.

and for the engines, is there any issue working with the node v14 or lower ?

It's a bit useless tbh but it prevents problem like the one I had with my PR because I use a private proxy by default. This allows to be sure to use the default npm registry for that project.

Afaik (but I might be wrong), npm 7 is shipped by default with node 15, thus the limitation. In the end, I think the node version could be lower, it's kind of an arbitrary choice. Npm 7 though brings breaking changes on the package-lock.json so we need to enforce its usage. In the end, as I didn't add engine-strict, this is just informative for now (aka useless too in a way).

Ok, I have only experience on the server side. Let me know if is there anything I can do.

It's ok, I'll take care of this.

No problem, I will do it, do not delete the existing code in the PR yet, please.

I'll add them as spec.old.ts or something alike.

You already have the solution, pretty sure ;) . Just to try to contribute with something, could be done by adding to the builder initialisation the initial state of the object.

We could end up with something like:

  const state = {
    type: "inject"
  };
  return builder<Injectstate>(injectstateValidator, state);

where state stored (by hand for now) and taken from some file by your generator.

What do you think?

I was thinking to set the property in the validator function so it's the last thing done before outputting the object. I'm thinking about adding "manual extensions" in the builder generator so I can add custom code before the validation for instance. It's not the most "automated" but as from now, I don't see any alternative to do better.

creating a wrapper is an option? we have to add also the integrity validations for the workflow object.

Yes, that's what I was thinking about, just a 'simple' class that holds a reference to the workflow specific validation function. I'll add this.

@antmendoza
Copy link
Member

antmendoza commented May 4, 2021

thanks @JBBianchi,

you don't have to do it at one.

As you may already know, the kubecon is happening and we have been asked to create issue to (potentially) have people collaborating.

I was thinking if some of the things that have still to be done in this side could be opened as issue, so we can label it as "good first issue". This is entirely up to you, just let me know if you see it appropriate.

thank you!

@JBBianchi
Copy link
Member Author

@antmendoza sure, any contributor is always welcome to help!

Here are the issues left with the latest version:

The other issues should be resolved

And of course, code reviews, improvements, suggestions, more tests, etc. it's never finished :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants