Smackage
Smackage exists! See https://github.com/standardml/smackage. If you're a developer, introductory documentation on how to adapt your project to Smackage can be found on the Smackage wiki.
This page will stick around with incomplete/foolish/wrong information for the sake of posterity.
There currently exists no standard way in which to package, discover and install SML libraries. We think this is worth fixing, as this might help SML libraries gain more traction, and help them to be maintained.
If this is something that you might be interested in working on, talk to gianp, rjsimmon or sully on #sml on Freenode.
- Which platforms do we support? SML/NJ and MLton?
- How much do we try to do? Just provide search over an index of packages and a standard way to fetch and unpack them? Or something more sophisticated?
- How does this interact with existing platform-specific packaging systems?
- Are there SML-specific artifacts we could usefully expose (e.g. signatures?)
- Naming hierarchy and signatures?
  - Python style?
  - Java style?
  - ... something else?
Here's a very basic thought or two that doesn't manage to take advantage of SML in any way. If people have experience with other versioning systems, they should chime in. Assuming people use semantic versioning correctly, as described at http://semver.org/ , the repository for "awesometool" could include a file awesometool.smackspec with the following contents.
require cmlib 1.0.2;
require utf8lib 2.1.5;
If we then ran, as part of a makefile process, perhaps, or just independently, a command like this:
$ smack awesometool.smackspec
Smackage would then need to know how to seek out cmlib and utf8lib. If it currently had lying around (on a Mac)
/Users/Me/Library/Standardml/smackage/cmlib-1.1.3
/Users/Me/Library/Standardml/smackage/utf8lib-2.1.5
Then it could just create cmlib-1.0.2 as a symlink to cmlib-1.1.3, since the later version should be backward compatible with the earlier one, but it would need to download either utf8lib-2.1.5 (or, maybe, the more recent utf8lib-2.4.9, and create a symlink).
That's all we do: put code into the right place and handle (transitive, versioned) dependencies, counting on the fact that people are using semantic versioning correctly (mainly, this means that version 2.1.5 is required to be backward-compatible with 2.0.3, but 3.0.1 is not). Let's assume that both of these libraries can be built with both ML Basis files and CM files.
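The resolution rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Smackage is implemented: the function names parse_requires, satisfies and resolve are hypothetical, the spec syntax is assumed to be exactly the "require <package> <version>;" form shown earlier, and the special treatment semantic versioning gives to 0.x.y releases is ignored.

```python
import re

# Hypothetical pattern for the draft "require <package> <version>;" lines.
REQUIRE_RE = re.compile(r"^\s*require\s+(\S+)\s+(\d+)\.(\d+)\.(\d+)\s*;\s*$")


def parse_requires(spec_text):
    """Return {package: (major, minor, patch)} for each require line."""
    requires = {}
    for line in spec_text.splitlines():
        m = REQUIRE_RE.match(line)
        if m:
            requires[m.group(1)] = tuple(int(g) for g in m.group(2, 3, 4))
    return requires


def satisfies(installed, required):
    """Assumed compatibility rule: same major version, and not older."""
    return installed[0] == required[0] and installed >= required


def resolve(requires, installed):
    """Map each package to the newest installed version that satisfies
    its requirement, or to None when something must be downloaded."""
    result = {}
    for name, wanted in requires.items():
        candidates = [v for v in installed.get(name, []) if satisfies(v, wanted)]
        result[name] = max(candidates) if candidates else None
    return result
```

With cmlib 1.1.3 installed, a requirement on cmlib 1.0.2 resolves to (1, 1, 3); a package with no compatible installed version resolves to None, which is the signal to go and fetch it.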
For MLton, there's a one-time configuration where we update the global mlb-path-map file (it lives at /usr/local/lib/mlton/mlb-path-map on Rob's MacBook) with the line
SMACKAGE /Users/Me/Standardml/smackage
Then, our own sources.mlb file could include
$(SMACKAGE)/cmlib-1.0.2/cmlib.mlb
$(SMACKAGE)/utf8lib-2.1.5/sources.mlb
For SML/NJ, there's a one-time configuration where we add to our local ~/.smlnj-pathconfig file the single line
SMACKAGE /Users/Me/Standardml/smackage
Then, our own sources.cm file could include
$SMACKAGE/cmlib-1.0.2/cmlib.cm
$SMACKAGE/utf8lib-2.1.5/sources.cm
Three cheers for Sully, who figured out what an smlnj-pathconfig line was.
I (Gian) think it's safe to release something really simple as version 0.0.1. A basic curl wrapper that supports semantic versioning and very basic dependency resolution is a big improvement over what we currently have. We could back it off Git initially, and look at issues like mirroring and other delivery methods once we have anybody actually using it.
A first attempt at a package spec format (this syntax will probably change to something like YAML):
provides <package> <version>;
description "...";
maintainer "...";
keywords: "...";
upstream-version: "..."; (probably a github address)
upstream-url: "..."; (a project/maintainer homepage if we're lucky)
git "...";
svn "...";
hg "...";
cvs "...";
doc-url "..."; (either package relative or a URL).
bug-url "..."; (a link to a bug reporting interface somewhere)
license ...;
require <package> <version>;
platform <SML environment> <version>; (e.g. MLton 20100608, smlnj 110.46)
build <package-relative-path-to-command>;
test <package-relative-path-to-command>;
install <package-relative-path-to-command>;
uninstall <package-relative-path-to-command>;
doc <package-relative-path-to-command>;
Are there other things we should include? Many of these values will probably be unused initially. It's just a way of having some good-quality meta-data for packages from day one. We can figure out better ways to use this data eventually.
Presumably running 'smack' as root should install packages globally, whereas running as a normal user will install somewhere in the user's home directory?
- rjsimmon - This seems plausible. Something I'd point out is that there's a "simplest possible" version of this that doesn't even understand that it's about standard ml - it's just about getting library code and can't be used for getting applications. This would take away "platform" "build" "test" "install" "uninstall" and maybe even "doc", and could be a simpler core ("stuffage," perhaps) upon which smackage could fork but which could be adapted to other uses.
- gianp - I agree in theory. I originally just had pre- and post-install hooks, but the idea to separate them out was to make it more compatible with things like deb packages. "platform" is there to make it easy to distinguish between a (for example) GTK+ binding that only works with MLton, versus one that works for SML/NJ, or one that works for both. It's to permit me to search for a library that works with my chosen development platform, so I think that's still needed in either case. "doc" might be just running 'make doc' to generate a texinfo manual. Or it might well do nothing in version 0.0.1 :)
<rjsimmon> Also, there's a problem in that if we don't specify a *specific* version, how will the user's .mlb file know what path to use to look for the correctly-versioned thing?
<gianp> rjsimmon, Yep, I thought that would be fine. I just implemented the semantic versioning spec, we don't need to use it all :)
<rjsimmon> oh, okay :-)
<gianp> rjsimmon, I was thinking we symlink .../libname/current to .../libname/1.5.1/
<rjsimmon> then you run into a haskell problem:
<gianp> Then you either include libname-1.5.1.{mlb,cm} or libname-current
<rjsimmon> what if I need package A and B, but A needs C version 1.0-1.5 and B needs C version 1.6-1.9
<rjsimmon> they both hardlink to "C/current"
<rjsimmon> but that can only be one thing.
<gianp> Hrm, you're right.
<rjsimmon> or "C-current", I guess, to follow your example
<gianp> So what about having smackage generate cm/mlb stanzas against concrete versions instead?
<rjsimmon> Possible. However, I want to see if we can *not* generate code as a simplicity measure.
<gianp> So when I build foo, it requires 1.0-1.5, so I just grab any version that satisfies that and return it. If I just require >= 1.5, then I'll get whatever the latest installed version is right now.
<gianp> Well, it's not so much generating code as much as it is outputting lists of installed packages.
<gianp> but I agree in principle.
<rjsimmon> like, I became very enamored with the semantic versioning principle that you maintain interface compatibility within a major release
<gianp> Yes, agreed.
<rjsimmon> that means that the "signature" for 1.2.1 can be satisfied by any "structure" 2.0.0, and so I can just hard-refer to 1.2.1.
<rjsimmon> packages that depend on a "2.x.x" and a "1.x.x" release of the same package (transitively) will have to make sure that the two versions can never see each other, but CM (Library) and mlb (local/in/end) provide facilities for this, which is the thing I want to take advantage of
<gianp> Right
<rjsimmon> that's a real benefit of using Standard ML - we couldn't get away with that in c or haskell or java or well, anything
<rjsimmon> At least I think
<gianp> So how about just symlinking lib/1.x.x -> 1.6.9 and lib/2.x.x -> 2.9.1
<gianp> and then you can just depend on some major version instead of "current".
<gianp> And then resolve any intra-application conflicts in the way you propose.
<rjsimmon> I like this direction. It's possible you might need a feature that is 1.7.4, so you wouldn't want 1.6.9 to satisfy that dependency
<gianp> Indeed not, so you would reference 1.7.4 directly, not 1.x.x'
<rjsimmon> and it always keeps "lib/v1" symlinked to the max "lib/1.X.Y" (for lexicographically maximum values of X and Y), so you can also just reference "v1" and make sure you're updated
<gianp> Down the road we could imagine that you could get smackage to suggest some most-generic solution to your version constraint set.
<gianp> But for the moment I don't think it would be too much work to just resolve any weird special cases manually
<rjsimmon> that, in turn, basically makes the damn thing the same as (or at least compatible with) MLton Library Project's "transparent per-library branching"
<rjsimmon> http://mlton.org/MLtonLibraryProject
Should we try to support git as a first option/cut? Possibly subversion as well? Assuming we know that git://github.com/robsimmons/l10.git provides a package "elton", the following commands could probably be automated (I was using an existing package as a straw man; according to semantic versioning the tag would need to be named v0.0.1, not elton-0.0.1.)
$ git ls-remote git://github.com/robsimmons/l10.git
e3e2db6d1a6ba0c3c1a45fd64bf9aaaa62953897 HEAD
ebe7a4dffd3a3b6359d9764b2025fed8c503bddc refs/heads/bug2
23a7a503d0796d9fc007ef3704a065c25f325bd7 refs/heads/bug2-fix
18355dbaacad819903abdb70d31eab390576ffce refs/heads/bug2-workaround
3e1965bfa24ef2c3e51871bdd8b1cacd6fe1a3d0 refs/heads/bug3
efda5565d5a762de5ec6024750c914504d065a59 refs/heads/bug4
e3e2db6d1a6ba0c3c1a45fd64bf9aaaa62953897 refs/heads/master
6d53d37be90efcf27352e13c11b79d8f7af7cbaf refs/tags/elton-0.0.1
$ mkdir elton
$ mkdir elton/0.0.1
$ cd elton/0.0.1/
$ git init
Initialized empty Git repository in /private/tmp/elton/0.0.1/.git/
$ git remote add origin git://github.com/robsimmons/l10.git
$ git pull origin tags/elton-0.0.1
remote: Counting objects: 1451, done.
remote: Compressing objects: 100% (1148/1148), done.
remote: Total 1451 (delta 468), reused 1244 (delta 270)
Receiving objects: 100% (1451/1451), 549.23 KiB | 998 KiB/s, done.
Resolving deltas: 100% (468/468), done.
From git://github.com/robsimmons/l10
* tag elton-0.0.1 -> FETCH_HEAD
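As a rough sketch of how the first step of that automation might look, the Python below extracts version tags from the text printed by git ls-remote. It is an illustration only: the function name versions_from_ls_remote is hypothetical, and it assumes tags follow the vMAJOR.MINOR.PATCH naming mentioned above, so a tag such as elton-0.0.1 is ignored.

```python
import re

# Assumes tags are named vMAJOR.MINOR.PATCH, as the note above suggests.
TAG_RE = re.compile(r"^refs/tags/v(\d+)\.(\d+)\.(\d+)$")


def versions_from_ls_remote(output):
    """Map (major, minor, patch) -> commit hash for each version tag
    found in the output of `git ls-remote <url>`."""
    versions = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        sha, ref = parts
        m = TAG_RE.match(ref)
        if m:
            versions[tuple(int(g) for g in m.groups())] = sha
    return versions
```

Taking max() over the resulting keys gives the newest tagged release, because the versions are compared as tuples of integers rather than as strings.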
The following collects some more high-level thoughts on packaging for
Standard ML projects.
The first part introduces some conceptual ideas for structuring such a
system. The main idea is for packages to be persistent, resulting in a
stable and recoverable system. For example, installing a new
package should never fail. If a package has once had a satisfiable
set of dependencies, that configuration should remain available
regardless of what other packages a user might have installed on their
system.
The second part describes splitting the system in a library part and a
developer tool part. The main idea is that tools and developers have
distinct needs. The library for the system should have a stable and
programmable API whereas the main management tool should provide
ease-of-use in the form of automation, interactive control, reasonable
defaults, etc.
* Registries
A registry is a listing of SML packages.
Registries may provide several versions of the same package.
** Online registries
An online registry provides a source of downloadable packages. An
online package should be locally installed to enable use of the
particular package.
Online registries could be provided by http, git, hg, darcs, svn,
etc. For a version controlled source provider access to different
versions could be provided simply by using a tagging scheme.
** Offline registries
An offline registry provides read-only access to packages that can
be used for project compilation. The offline registry is a
file-based organization of packages. Once a package is installed it
should never change (ie, other versions, such as updates, are
installed alongside and independently of existing packages).
An offline package registry can be located anywhere on the
system. For convenience two special offline registries exist:
*** System global
(/usr/local/lib/smackage/; /usr/lib/smackage/; ...)
The system global registry is a special/privileged offline registry
such that installed packages can be viewed and shared by many users
of the same system.
*** User local
(~/.smackage/; ...)
The user local registry is a special/privileged offline registry such
that a user may, without root access, install and reuse packages
for many projects.
* Development directories
A development directory is a SMACKAGE-enabled SML project directory.
It contains a special file (myproj.smack) which specifies the
project name, version, etc., and it contains a special directory
(.smackage) which holds all the packages needed to build the SML
project. For example:
./.smackage/foo -> /usr/lib/smackage/foo-v1.6.3
./.smackage/bar
./myproj.smack
./myproj.cm
./myproj.mlb
./src/lots-of-files.sml
The libraries could be an actual smackage package (such as bar) or
it could be a symlink to a specific package installed elsewhere on
the system (such as foo).
On top of the above abstractions it should be possible to support:
- Semantic Versioning: at installation time the best package version
match is selected and recorded as part of the package installation.
Later, an upgrade can force an existing package to reinstall with
newer dependencies.
- Rollback: any package update can be rolled back to the previous
installation since it will not actively remove previous versions of
any package.
- Restricted version constraints: in addition nothing prevents
packages from forcing use of a specific package version.
- Local development: it is easy to change the package dependencies
on a "per project" basis and even to develop on existing libraries
at the same time. The directories inside .smackage are simply
updated to reflect the developer's intentions without needing to
change anything in the system wide or user local setup.
- Online package browser: it should be straightforward to create a
package browser by supplying it with registry links that could be
used to populate the browser with. This would mean decentralized
package locations which has its pros and cons. The big win would
be that users could simply use github and a new release would be
a tag in the repository.
Some problematic issues are:
- Garbage collection: since packages are persistent we need support
for garbage collecting unneeded packages. This could be provided
for the global/local registries (based on having the most recent
version of each package and then marking their dependency
graph.) One could add support for registering other roots so
development directories could be taken into account too.
- Windows: the development directory structure assumes symlinks. We
could fix this on windows by adding support for a link file that
is specially parsed by the smackage system. However, these files
would appear as simple text files to the developer.
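The garbage collection idea from the list above amounts to a mark-and-sweep pass over the dependency graph. The following Python is a minimal sketch under assumptions, not a design decision: the name collect_garbage and the shape of the deps mapping are hypothetical, and the roots are taken to be the most recent installed version of each package plus any extra roots registered by development directories.

```python
def collect_garbage(deps, extra_roots=()):
    """deps maps (name, version) -> list of (name, version) dependencies,
    with versions as (major, minor, patch) tuples.
    Returns the set of installed packages unreachable from the roots."""
    # Roots: the most recent installed version of each package.
    newest = {}
    for name, version in deps:
        if name not in newest or version > newest[name]:
            newest[name] = version
    # Extra roots let development directories keep older versions alive.
    stack = list(newest.items()) + [r for r in extra_roots if r in deps]
    marked = set()
    while stack:  # mark phase: walk the dependency graph
        pkg = stack.pop()
        if pkg in marked:
            continue
        marked.add(pkg)
        stack.extend(d for d in deps[pkg] if d in deps)
    return set(deps) - marked  # sweep phase: everything left unmarked
```

An old version survives collection if it is still a dependency of some root, or if a development directory registers it as an extra root.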
The system should be cleanly split in two parts. The first being a
library to support tool writing. The second part being a developer
tool for managing packages.
* The software package library (smacklib)
A stable API for managing packages and registries. This library is
for tooling and should not provide any command-line interface for
software developers. Also, this library is not a build
system. Issues related to compilation should be taken care of by
existing build systems and abstracted by this library.
Implementation independent description of:
- libraries and/or executables
- dependencies and resolution (eg, via semantic versioning)
- language dependencies and features
- compiler / build system abstractions
Should be independent of actual compilers, operating systems, and
package locations (ie, the environment must be completely abstract).
* The software package manager (smack)
Creation, installation and removal of packages.
This should provide an easy to use interface for software
developers, à la cabal-install.
Should be integrated with package registries and mirrors thereof
(online and offline).
* Requirements
For maximal portability among SML implementations, the package
system should be written in Core SML + references with as few
dependencies as possible.
We want to have symlinks present for any version that is referenced in a 'require' directive. By example:
An end user wants to install a library 'foolib'. He just wants the latest version. He runs 'smack install foolib', and gets the latest version '1.7.4'. In his MLB/CM files, he can reference:
$(SMACKAGE)/foolib/v1
v1 will always be symlinked to the latest 1.x.x version.
The provider of foolib depends on barlib. The spec file for foolib contains a require line:
require barlib 2.9.x;
In the MLB/CM files for foolib, the author can now reference
$(SMACKAGE)/barlib/v2.9
And be assured that on any client system, v2.9 will point to the latest 2.9.x version available on the client system.
(Note: the above example would also work if the author required any 2.x.x version, as one could reference v2 as well)
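To make the symlink scheme concrete, here is a small Python sketch that computes which version each alias (v1, v2.9, and so on) should point to for one package. The function name alias_targets is hypothetical, and this is only one way the bookkeeping could be done. Versions are compared numerically as tuples, so 1.10.0 counts as newer than 1.9.0, which a plain string comparison would get wrong.

```python
def alias_targets(installed):
    """Given the installed (major, minor, patch) tuples of one package,
    return {alias: version}, where "vMAJOR" and "vMAJOR.MINOR" each
    point at the newest installed version with that prefix."""
    aliases = {}
    # Visit versions oldest first so newer ones overwrite older ones.
    for version in sorted(installed):
        major, minor, _patch = version
        aliases["v%d" % major] = version
        aliases["v%d.%d" % (major, minor)] = version
    return aliases
```

A tool could then create or update one symlink per entry, for example pointing foolib/v1 at the directory holding the version returned for "v1".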
We're taking the cheap-and-easy route for package indexes. Given that we already know how to parse spec files, a package index is just a bunch of specs concatenated together with some kind of delimiter. This will not scale infinitely, but it'll work for the kinds of scales we're talking about at this stage.
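A sketch of what reading such an index might look like, in Python. Everything specific here is an assumption made for illustration: the delimiter (a line containing only %%), the function name parse_index, and the use of the draft "provides <package> <version>;" line to key each spec.

```python
def parse_index(index_text, delimiter="%%"):
    """Split concatenated specs on delimiter lines and key each spec
    by the (package, version) strings from its "provides" line."""
    index = {}
    current = []
    # A trailing delimiter is appended so the last spec is flushed too.
    for line in index_text.splitlines() + [delimiter]:
        if line.strip() != delimiter:
            current.append(line)
            continue
        spec = "\n".join(current).strip()
        current = []
        if not spec:
            continue  # ignore empty chunks between delimiters
        for spec_line in spec.splitlines():
            words = spec_line.strip().rstrip(";").split()
            if len(words) == 3 and words[0] == "provides":
                index[(words[1], words[2])] = spec
                break
    return index
```

Looking up a package is then a dictionary access, which is fine at the small scales discussed here; specs without a "provides" line are silently skipped in this sketch.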
A Hoogle-like search engine would be a killer feature; essentially, it allows you to search on an approximate type signature and find the functions that have matching type signatures. Try this example search.