
Connect executables with targets #591

Open
straight-shoota opened this issue Jul 15, 2023 · 14 comments

@straight-shoota
Member

straight-shoota commented Jul 15, 2023

shard.yml allows defining executables that are supposed to be installed in the main project's bin/ folder, and it allows defining build targets. But the two are currently unrelated.
If executables need to be built, the only way to do that is to run a build command in postinstall. Ideally, that's just shards build.

name: example
version: 0.0.0

targets:
  foo:
    main: src/foo.cr

scripts:
  postinstall: shards build

executables:
- foo

I think it would be a good opportunity to automate that by hooking up executables with the available target information.

The operation would be quite simple: when shards cannot find a file for a declared executable, but there is a target definition with the same name, it runs shards build $executable_name. Afterwards, the executable file should exist.

This works without disrupting existing workflows where the executable is already available in the source or built by a postinstall hook because existing files would be preferred.
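The lookup order can be sketched in plain shell (an illustration of the decision logic only; `decide` and its arguments are made up for this demo and are not shards internals):

```shell
#!/bin/sh
# Illustration of the proposed lookup order for one declared executable.
# decide() is a made-up stand-in for what shards would do internally:
# 1. an existing executable file is preferred (shipped in the source or
#    built by a postinstall hook),
# 2. otherwise a target with the same name triggers `shards build`,
# 3. otherwise it is an error, as today.
decide() {
  exe=$1; file_exists=$2; target_exists=$3
  if [ "$file_exists" = yes ]; then
    echo "install existing bin/$exe"
  elif [ "$target_exists" = yes ]; then
    echo "run: shards build $exe, then install bin/$exe"
  else
    echo "error: executable $exe not found"
  fi
}

decide foo yes no   # existing file wins, so current workflows are undisrupted
decide foo no yes   # falls back to building the matching target
decide foo no no    # neither a file nor a target exists
```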

An immediate result is simplification of the shard.yml, with the additional benefit of turning imperative instructions into declarations (which helps portability).

So the above shard.yml could be shortened to this:

name: example
version: 0.0.0

targets:
  foo:
    main: src/foo.cr

executables:
- foo

No need to define shards build (or the equivalent using make or whatever) in postinstall.
This probably won't work for all shards, because some have more complex builds. But I figure it should be good for the vast majority of typical development tools.

As a further enhancement, instead of building directly upon installation, the executable could actually be a shim which invokes the build command on demand. This avoids waiting for executables to build when running shards install and only builds them if they are actually used. That causes some delay when used the first time, but that should be acceptable: If you want to use it, you have to wait for the build anyway at some point. But if you don't use it, there's no need to build it!
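To make the shim idea concrete, here is a self-contained sketch (all names and paths are invented for the demo, and the `cp` stands in for the real `shards build` invocation; the demo also builds to a sibling path rather than overwriting the shim itself, purely for simplicity):

```shell
#!/bin/sh
# Demo of a lazy-build shim (illustrative only; not actual shards code).
set -e
dir=$(mktemp -d)

# Stand-in for the compiled binary that `shards build foo` would produce:
cat > "$dir/foo.built" <<'EOF'
#!/bin/sh
echo "real foo running: $@"
EOF
chmod +x "$dir/foo.built"

# The shim installed as bin/foo. On first use it "builds" the real binary
# next to itself (here via cp), then execs it; later runs skip the build.
cat > "$dir/foo" <<'SHIM'
#!/bin/sh
real="$0.bin"
if [ ! -x "$real" ]; then
  echo "First run of foo, compiling..." >&2
  cp "$(dirname "$0")/foo.built" "$real"   # stand-in for: shards build foo
fi
exec "$real" "$@"
SHIM
chmod +x "$dir/foo"

"$dir/foo" hello   # first run: prints the compile notice on stderr, then runs
"$dir/foo" again   # second run: binary already exists, no compile notice
```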

This proposal is based on some ideas previously mentioned in https://forum.crystal-lang.org/t/shards-postinstall-considered-harmful/3910/14?u=straight-shoota

@Vici37

Vici37 commented Jul 16, 2023

I really like this idea of being able to list out executables that map to build targets, and letting shards take care of the building rather than implementing a build script of sorts via make (or any other build tool of choice).

As a further enhancement, instead of building directly upon installation, the executable could actually be a shim which invokes the build command on demand. This avoids waiting for executables to build when running shards install and only builds them if they are actually used. That causes some delay when used the first time, but that should be acceptable: If you want to use it, you have to wait for the build anyway at some point. But if you don't use it, there's no need to build it!

This portion I dislike, but maybe I'm not seeing the benefit - why would I want to wait for a tool to build the first time I invoke it, instead of when I run shards install? When I run shards install, it's already a fire-and-forget command and I can revisit that terminal a little later (or at least be confident that there are no inputs required from me). When I run a tool, I expect to provide input and get immediate feedback. Forcing me to wait while I'm expecting immediate feedback is a poor developer experience, even if it's only once in a while.

I can't speak for others, but I only include executables / tools that I actively use in my shard.yml, so this may impact me more than it impacts you, @straight-shoota.

@Sija
Contributor

Sija commented Jul 16, 2023

This portion I dislike, but maybe I'm not seeing the benefit - why would I want to wait for a tool to build the first time I invoke it, instead of when I run shards install?

@Vici37 IIUC, the reason is to reduce the shards install run time. Building Ameba, for instance, slows down the process considerably.

@straight-shoota
Member Author

@Vici37

I only include executables / tools that I actively use in my shard.yml

Use cases differ. I would argue that I might not need all these tools if I just want to check out the code of your shard, or maybe build it and run its tests.
Think about an imaginary tool for database migration: If your database doesn't change much, it might only be used once in a while. There's really no need to have it built every time you update the database driver.
I have written about this more extensively in https://forum.crystal-lang.org/t/shards-postinstall-considered-harmful/3910. The gist:

Every time I run shards install for a shard using ameba, ameba builds itself. But that’s only useful if you want to contribute to the shard. Most of the time, I don’t need it. And I certainly didn’t ask for it.
Why do I have to wait for ameba to build? Ameba shouldn’t build itself every time it’s installed.
I picked ameba because it’s popular and has stressed my patience many times now. But many other shards are very similar.

And I'm not suggesting that lazy building should be the only option. If you want to have all executables built on shards install, that should certainly be possible. We can discuss which should be the standard behaviour and whether there should maybe be a user option for selecting the behaviour.

@Vici37

Vici37 commented Jul 17, 2023

That's fair. Reading your reasons for wanting to pull and build a given shard here, it does look like there are existing shards flags to get the behavior you want, such as --skip-postinstall and --production.

I'm not opposed to creating a new sub-sub command to handle this phase (e.g. shards install tools or some such), to split out the lifecycle of a dependency's executable from the dependency's source code. Overall, I'd want to apply the principle of least surprise to the developer experience, and I think an async flow (with or without shims) would lead to more surprises rather than fewer.

@jgaskins

As a further enhancement, instead of building directly upon installation, the executable could actually be a shim which invokes the build command on demand. This avoids waiting for executables to build when running shards install and only builds them if they are actually used. That causes some delay when used the first time, but that should be acceptable: If you want to use it, you have to wait for the build anyway at some point. But if you don't use it, there's no need to build it!

I've built a proof of concept for this into my Interro shard in this commit.

I updated one of my apps to use that branch and this is what happens when I run my bin/setup script which, among other things, calls bin/interro-migration for both my dev and test DBs (output trimmed for brevity):

$ bin/setup
First run of interro-migration, compiling...
Running CreateUsers
CREATE TABLE users(
  id UUID PRIMARY KEY NOT NULL DEFAULT gen_random_uuid(),
  email TEXT UNIQUE NOT NULL,
  name TEXT NOT NULL,
  password TEXT NOT NULL,
  role INT4 NOT NULL DEFAULT 0,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
)
CreateUsers: 1.24ms
...
Running CreateUsers
CREATE TABLE users(
  id UUID PRIMARY KEY NOT NULL DEFAULT gen_random_uuid(),
  email TEXT UNIQUE NOT NULL,
  name TEXT NOT NULL,
  password TEXT NOT NULL,
  role INT4 NOT NULL DEFAULT 0,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
)
CreateUsers: 1.17ms
...

Note that when it ran bin/interro-migration for the dev DB, it showed that it was the first run and was compiling it. The second run, for the test DB, did not show that output because the binary was already compiled over the placeholder script.

And this is repeatable — I’m running rm -rf bin/interro-migration lib && shards update interro && dropdb $DEV_DB && dropdb $TEST_DB && bin/setup and it works every time.

@ysbaddaden
Contributor

ysbaddaden commented Feb 21, 2025

The current solution (calling shards build) will work on every platform, but this solution won't work on targets without sh, for example.

What about asking users to write a three-line script that they can then customize, or even embed right into their own executable (as a subcommand, for example)?

require "pg"
require "my_migration/cli"
MyMigration::CLI.call(ARGV, ENV["DATABASE_URL"])

@jgaskins

The current solution (calling shards build) will work on every platform, but this solution won't work on targets without sh for example.

I didn't call it a solution. I called it a proof of concept. It is not intended to cover every use case as written. A proper solution using this idea would include a pattern that already has precedent in several parts of the Crystal stdlib — we can simply generate code based on which platform is in use.

If a POSIX system doesn't have sh, it might have bash, zsh, fish, or one of a handful of other shells and shards can generate a script for whichever shell it detects. On Windows, it can output a Powershell script or batch file.
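For illustration, the detection could be as simple as this sketch (purely hypothetical; shards does not currently do anything like this, and the flavor names are made up):

```shell
#!/bin/sh
# Hypothetical shim-flavor selection (not actual shards behaviour):
# probe for available interpreters and pick a script format accordingly.
pick_shim_flavor() {
  if command -v sh >/dev/null 2>&1; then
    echo "posix-sh"
  elif command -v pwsh >/dev/null 2>&1 || command -v powershell >/dev/null 2>&1; then
    echo "powershell"
  else
    echo "batch"
  fi
}

pick_shim_flavor
```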

What about asking users to write a three-line script that they can then customize, or even embed right into their own executable (as a subcommand, for example)?

This helps avoid compiling ameba unnecessarily, but it means every shard with a CLI requires manual effort when the shard is installed and every time any of those shards is updated. I use several shards with CLIs (code generators, DB migrations, protobuf plugins, etc), so this becomes a strong net negative really fast. I'd rather sit through an ameba compilation a dozen times than write a CLI to accommodate a shard once.

There are times when writing a CLI can't be avoided (background job processors need the code from your app's background job handlers, for example), but if I have to do it for every CLI I use, I'd be annoyed as hell. I'm autistic and putting that many speed bumps in front of me would make me lose interest in this ecosystem entirely.

I use the CLIs I use for the same reason I want them compiled when I install/update their shards: because automating manual processes is frequently a good thing. The shell script shim will overwrite the existing binary any time the shard is updated, which will then recompile the CLI automatically.

@jgaskins

jgaskins commented Feb 22, 2025

if I have to do it for every CLI I use, I'd be annoyed as hell

It occurred to me that, without this context, the magnitude of my objection might seem a bit excessive: I use Crystal primarily for microservices. These services aren't tiny single-function deployments, but I'm not dealing with a monolith, either, due to compilation times. You may think it's not a big deal to write a single CLI and I can sympathize with that perspective (while I don't agree with it, I do understand it), but with 4 shards providing CLIs (protobuf, grpc, interro, and wax) across 13 services, that's 52 CLIs to write. It also means the extraction of new services comes with the overhead of writing 4 more shard CLIs on top of the entrypoints for the service itself.

And while there might be some commonalities that could be used across all these services, I'd still have to rearchitect my services' entrypoints because a handful of people don't want to compile ameba. I don't like that tradeoff at all.

That's why I'm trying to accommodate Johannes's idea of lazily compiled binaries. It took me a while to come around to it but, after trying it out, it seems like a great compromise.

@ysbaddaden
Contributor

with 4 shards providing CLIs (protobuf, grpc, interro, and wax) across 13 services, that's 52 CLIs to write

Thanks, this is what we want: use cases!

I'm not saying the following are good ideas. I'm trying to think outside the box: how would we get by without postinstall and executables?

What about a monorepo or docker container or nix environment, so you'd only write it once per shard and use it everywhere?

You'd only build (or download) 4 executables instead of 13 times the same 4 executables. Adding a service would immediately use that environment, no need to install all these executables again.

Doesn't wax feel more like a global tool, installed once and seldom updated (same as ameba)?

For example we install Crystal and its stdlib once, either globally or through asdf, docker or nix. We could install and build it for every project 🤷

shards can generate a script for whichever shell it detects.

That sounds horrible for Shards to implement and maintain 😨

@luislavena
Contributor

Thank you @jgaskins for sharing your use case.

I know what I'm about to ask is weird, but bear with me for a minute:

What if shards had never implemented postinstall or executables? Simply put: they do not exist and were never even considered. How would you have solved this challenge?

I know it's hard to imagine that scenario since postinstall dates back to 2015, but, for the case of wax, which looks to be a code generator:

  • Would you have cloned wax, built it, and then copied it to your ~/bin or /usr/local/bin directory and called it from your projects?
  • If you were using containers, would you have pre-built it part of the container image to avoid the repeat burden?

What about the other shards that are part of your projects, and the CLIs they may provide? And what about shards that depend on each other and may or may not need the CLIs they provide in order to work?

I'm honestly interested in hearing this, as we are all talking about removing a piece of functionality, and the anxiety and fear that generates, without seeing how the same thing might be accomplished differently.

Thank you for your patience, openness and candor on your comments.
❤ ❤ ❤

@jgaskins

What about a monorepo or docker container or nix environment, so you'd only write it once per shard and use it everywhere?

You'd only build (or download) 4 executables instead of 13 times the same 4 executables. Adding a service would immediately use that environment, no need to install all these executables again.

This is what I was getting at with "I'd still have to rearchitect my services' entrypoints…".

The shard developers would ideally still document how to include these things in your apps, but (a) I don't imagine that practice would be universal[1], and (b) the ones that do will still end up having to put effort into documentation for that, which can become out of date without the maintainer noticing. That is highly unlikely with postinstall and executables because it would no longer work for the maintainer.

And it still puts responsibility onto the developers of the apps. I would have to come up with some kind of solution for this that I currently do not have to come up with. This is a net negative for me.

Doesn't wax feel more like a global tool, installed once and seldom updated (same as ameba)?

For example we install Crystal and its stdlib once, either globally or through asdf, docker or nix. We could install and build it for every project 🤷

The main reason it's implemented as a shard is that the code it generates depends on armature, interro, and a few other shards. So if you add wax to your dependencies, it all just works.

I've also been considering implementing a plugin architecture for it that works kinda like git or kubectl plugins, where you could install a shard that would extend wax's behavior by installing other executables — like how git upload-pack runs the git-upload-pack executable.
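The git-style dispatch mentioned above can be demonstrated in a few lines of shell (a self-contained toy; the `wax`/`wax-hello` names are stand-ins for how a hypothetical plugin mechanism could work, not an existing wax feature):

```shell
#!/bin/sh
# Toy demo of git-style plugin dispatch: `wax <sub> args...` execs
# an executable named `wax-<sub>` found on PATH.
set -e
dir=$(mktemp -d)

# A plugin: any executable named wax-<something> on PATH.
cat > "$dir/wax-hello" <<'EOF'
#!/bin/sh
echo "hello plugin: $@"
EOF
chmod +x "$dir/wax-hello"

# The dispatcher itself.
cat > "$dir/wax" <<'EOF'
#!/bin/sh
cmd=$1; shift
exec "wax-$cmd" "$@"
EOF
chmod +x "$dir/wax"

PATH="$dir:$PATH" "$dir/wax" hello world
```

This mirrors how `git upload-pack` resolves to the `git-upload-pack` executable: the dispatcher knows nothing about its subcommands, so installing a shard could extend the tool just by dropping another executable on PATH.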

That sounds horrible for Shards to implement and maintain 😨

Is it meaningfully different from how Crystal provides all of these LibC backends and scans multiple locations for TZ databases?


[1]

When there is a knowledge imbalance about a manual process, the people that have intimate knowledge about the thing will assume others do, too. This happens a lot: engineers talking with non-engineers about how a feature works, engineers working on infrastructure talking to engineers who work on application code, etc. In this case, the maintainers would know about the updated API, but others would not.

@straight-shoota
Member Author

straight-shoota commented Feb 22, 2025

with 4 shards providing CLIs (protobuf, grpc, interro, and wax) across 13 services, that's 52 CLIs to write.

Could you clarify what you mean by "CLIs to write"?

Doesn't wax feel more like a global tool, installed once and seldom updated (same as ameba)?

For example we install Crystal and its stdlib once, either globally or through asdf, docker or nix. We could install and build it for every project 🤷

The main reason it's implemented as a shard is that the code it generates depends on armature, interro, and a few other shards. So if you add wax to your dependencies, it all just works.

wax is a code generator, so couldn't it just generate code to add the necessary dependencies in shard.yml? Optionally even run shards install to materialize them.

@jgaskins

Could you clarify what you mean with "CLIs to write"?

Julien's suggestion here:

What about asking users to write a three-line script that they can then customize

The idea doesn't scale for microservice architectures.

wax is a code generator, so couldn't it just generate code to add the necessary dependencies in shard.yml? Optionally even run shards install to materialize them.

Sure, in theory. I could extract other parts of wax not specific to code generation (asset compilation, project-local requires, auto-reloading web server, etc) to one or more shards and have wax generate code to drive them.

In practice, this puts yet another burden on the app developer. If wax is centralized on the machine, as Julien suggests, they have to ensure that they're running a version of it that is compatible with all of the shards they use it to generate code for. Otherwise, it'll generate code that doesn't compile, which the developer then has to debug.

It creates a 1:N relationship between the code generator and the number of versions of the dependencies that wax has to care about. This is a maintenance burden I can't take on.

wax is designed as a shard so that it only needs to care about the APIs of the dependencies as they exist at that point in time. It's the same reason the Crystal compiler and stdlib are colocated. Version 1.15.1 of the compiler doesn't need to be concerned with version 1.14.0 of the stdlib.

@jgaskins

@luislavena

What if shards had never implemented postinstall or executables? Simply put: they do not exist and were never even considered. How would you have solved this challenge?

  • Would you have cloned wax, built it, and then copied it to your ~/bin or /usr/local/bin directory and called it from your projects?

Funny you mention this. The protobuf shard used to recommend exactly this (and, actually, I just double-checked and the README still recommends it — I thought I'd removed it). I changed it 5 years ago because that workflow felt really clunky and automatically compiling it was a huge improvement.

  • If you were using containers, would you have pre-built it part of the container image to avoid the repeat burden?

I deploy containers and the CLIs only ever have to build when the shards change (new shards added, one or more updated, etc). My apps have COPY shard.* … and RUN shards install cached in base images so my container builds typically start by copying and compiling Crystal code.
