robsimmons edited this page Sep 27, 2012 · 28 revisions

Smackage exists! See https://github.com/standardml/smackage! If you're a developer, the initial documentation on how to adapt your project to Smackage can be found on the Smackage wiki.

This page will stick around with incomplete/foolish/wrong information for the sake of posterity.

Archive

There currently exists no standard way in which to package, discover and install SML libraries. We think this is worth fixing, as this might help SML libraries gain more traction, and help them to be maintained.

If this is something that you might be interested in working on, talk to gianp, rjsimmon or sully on #sml on Freenode.

Considerations:

  • Which platforms do we support? SML/NJ and MLton?
  • How much do we try to do? Just provide search over an index of packages and a standard way to fetch and unpack them? Or something more sophisticated?
  • How does this interact with existing platform-specific packaging systems?
  • Are there SML-specific artifacts we could usefully expose (e.g. signatures?)
  • Naming hierarchy and signatures?
    • Python style?
    • Java style?
    • --- something else?

What would this look like?

Here's a very basic thought or two that doesn't manage to take advantage of SML in any way. If people have experience with other versioning systems, they should chime in. Assuming people use semantic versioning correctly, as described at http://semver.org/, the repository for "awesometool" would include a file awesometool.smackspec with the following contents.

require cmlib 1.0.2;
require utf8lib 2.1.5;

If we then ran, as part of a makefile process, perhaps, or just independently, a command like this:

$ smack awesometool.smackspec

Smackage would then need to know how to seek out cmlib and utf8lib. If it currently had lying around (on a Mac)

/Users/Me/Library/Standardml/smackage/cmlib-1.1.3
/Users/Me/Library/Standardml/smackage/utf8lib-2.1.5

Then it could just symlink cmlib-1.0.2 to the installed cmlib-1.1.3, since the later version should be backward-compatible with the earlier one, but it would need to download either utf8lib-2.1.5 or, maybe, the more recent utf8lib-2.4.9 (and create a symlink).

That's all we do: put code into the right place and handle (transitive, versioned) dependencies, counting on the fact that people are using semantic versioning correctly (mainly, this means that version 2.1.5 is required to be backward-compatible with 2.0.3, but 3.0.1 is not). Let's assume that both of these libraries can be used with ML Basis files and with CM files.
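The compatibility rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, versions are plain MAJOR.MINOR.PATCH strings, and pre-release tags and the special status of 0.x.y versions in the semver spec are ignored.

```python
def parse_version(text):
    """Turn '1.0.2' into the tuple (1, 0, 2)."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def satisfies(installed, required):
    """True if `installed` can stand in for `required` under semantic versioning."""
    have, want = parse_version(installed), parse_version(required)
    # Same major version (same interface) and at least as new as requested.
    return have[0] == want[0] and have >= want


def best_installed(installed_versions, required):
    """Pick the newest installed version that satisfies `required`, or None."""
    candidates = [v for v in installed_versions if satisfies(v, required)]
    return max(candidates, key=parse_version) if candidates else None
```

With these definitions, an installed cmlib 1.1.3 satisfies a requirement for cmlib 1.0.2, while 3.0.1 does not satisfy a requirement for 2.0.3; when `best_installed` returns None, the tool would have to download something.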

MLton/ML Basis

There's a one-time configuration step where we update the global mlb-path-map file (it lives at /usr/local/lib/mlton/mlb-path-map on Rob's MacBook) with the line

SMACKAGE /Users/Me/Standardml/smackage

Then, our own sources.mlb file could include

$(SMACKAGE)/cmlib-1.0.2/cmlib.mlb
$(SMACKAGE)/utf8lib-2.1.5/sources.mlb

Standard ML of New Jersey/Compilation Manager

There's a one-time configuration step where we add to our local ~/.smlnj-pathconfig file the single line

SMACKAGE /Users/Me/Standardml/smackage

Then, our own sources.cm file could include

$SMACKAGE/cmlib-1.0.2/cmlib.cm
$SMACKAGE/utf8lib-2.1.5/sources.cm

Three cheers for Sully, who figured out what an smlnj-pathconfig line was.

Implementation

I (Gian) think it's safe to release something really simple as version 0.0.1. A basic curl wrapper that supports semantic versioning and very basic dependency resolution is a big improvement over what we currently have. We could back it with Git initially, and look at issues like mirroring and other delivery methods once we have anybody actually using it.

A first attempt at a package spec format (this syntax will probably change to something like YAML):

provides <package> <version>;
description "...";
maintainer "...";
keywords: "..."; 
upstream-version: "..."; (probably a github address)
upstream-url: "..."; (a project/maintainer homepage if we're lucky)
git "...";
svn "...";
hg "...";
cvs "...";
doc-url "..."; (either package relative or a URL).
bug-url "..."; (a link to a bug reporting interface somewhere)
license ...;
require <package> <version>;
platform <SML environment> <version>; (e.g. MLton 20100608, smlnj 110.46)
build <package-relative-path-to-command>;
test <package-relative-path-to-command>;
install <package-relative-path-to-command>;
uninstall <package-relative-path-to-command>;
doc <package-relative-path-to-command>;

Are there other things we should include? Many of these values will probably be unused initially. It's just a way of having some good-quality meta-data for packages from day one. We can figure out better ways to use this data eventually.
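To make the draft format concrete, here is a minimal Python sketch of a reader for it. It assumes one statement per line, terminated by ';', with anything after the ';' treated as an annotation and ignored; `parse_spec` and `requirements` are hypothetical names, and since the syntax itself was expected to change, none of this reflects an actual Smackage parser.

```python
import shlex


def parse_spec(text):
    """Parse 'key value...;' statements into a list of (key, [values]) pairs."""
    entries = []
    for line in text.splitlines():
        # Limitation of this sketch: a ';' inside a quoted string ends the statement.
        statement, semicolon, _annotation = line.partition(";")
        if not semicolon or not statement.strip():
            continue  # skip blank lines and lines without a terminating ';'
        words = shlex.split(statement)  # keeps double-quoted strings together
        key = words[0].rstrip(":")  # tolerate both 'key' and 'key:'
        entries.append((key, words[1:]))
    return entries


def requirements(entries):
    """Extract (package, version) pairs from the 'require' statements."""
    return [(values[0], values[1]) for key, values in entries if key == "require"]
```

Keeping the result as an ordered list of pairs, rather than a dictionary, preserves repeated keys such as multiple `require` lines.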

Presumably running 'smack' as root should install packages globally, whereas running as a normal user will install somewhere in the user's home directory?

  • rjsimmon - This seems plausible. Something I'd point out is that there's a "simplest possible" version of this that doesn't even understand that it's about standard ml - it's just about getting library code and can't be used for getting applications. This would take away "platform" "build" "test" "install" "uninstall" and maybe even "doc", and could be a simpler core ("stuffage," perhaps) upon which smackage could fork but which could be adapted to other uses.
  • gianp - I agree in theory. I originally just had pre- and post-install hooks, but the idea to separate them out was to make it more compatible with things like deb packages. "platform" is there to make it easy to distinguish between a (for example) GTK+ binding that only works with MLton, versus one that works for SML/NJ, or one that works for both. It's to permit me to search for a library that works with my chosen development platform, so I think that's still needed in either case. "doc" might be just running 'make doc' to generate a texinfo manual. Or it might well do nothing in version 0.0.1 :)

A conversation about "current versions"

<rjsimmon> Also, there's a problem in that if we don't specify a *specific* version, how will the user's .mlb file know what path to use to look for the correctly-versioned thing?
<gianp> rjsimmon, Yep, I thought that would be fine.  I just implemented the semantic versioning spec, we don't need to use it all :)
<rjsimmon> oh, okay :-)
<gianp> rjsimmon, I was thinking we symlink .../libname/current to .../libname/1.5.1/
<rjsimmon> then you run into a haskell problem:
<gianp> Then you either include libname-1.5.1.{mlb,cm} or libname-current
<rjsimmon> what if I need package A and B, but A needs C version 1.0-1.5 and B needs C version 1.6-1.9
<rjsimmon> they both hardlink to "C/current"
<rjsimmon> but that can only be one thing.
<gianp> Hrm, you're right.
<rjsimmon> or "C-current", I guess, to follow your example
<gianp> So what about having smackage generate cm/mlb stanzas against concrete versions instead?
<rjsimmon> Possible. However, I want to see if we can *not* generate code as a simplicity measure.
<gianp> So when I build foo, it requires 1.0-1.5, so I just grab any version that satisfies that and return it.  If I just require >= 1.5, then I'll get whatever the latest installed version is right now.
<gianp> Well, it's not so much generating code as much as it is outputting lists of installed packages.
<gianp> but I agree in principle.
<rjsimmon> like, I became very enamored with the semantic versioning principle that you maintain interface compatibility within a major release
<gianp> Yes, agreed.
<rjsimmon> that means that the "signature" for 1.2.1 can be satisfied by any "structure" 2.0.0, and so I can just hard-refer to 1.2.1.
<rjsimmon> packages that depend on a "2.x.x" and a "1.x.x" release of the same package (transitively) will have to make sure that the two versions can never see each other, but CM (Library) and mlb (localinend) provide facilities for this, which is the thing I want to take advantage of
<gianp> Right
<rjsimmon> that's a real benefit of using Standard ML - we couldn't get away with that in c or haskell or java or  well, anything
<rjsimmon> At least I think
<gianp> So how about just symlinking lib/1.x.x -> 1.6.9 and lib/2.x.x -> 2.9.1
<gianp> and then you can just depend on some major version instead of "current".
<gianp> And then resolve any intra-application conflicts in the way you propose.
<rjsimmon> I like this direction. It's possible you might need a feature that is 1.7.4, so you wouldn't want 1.6.9 to satisfy that dependency
<gianp> Indeed not, so you would reference 1.7.4 directly, not 1.x.x'
<rjsimmon> and it always keeps "lib/v1" simlinked to the max "lib/1.X.Y" (for lexicographically maximum values of X and Y), so you can also just reference "v1" and make sure you're updated
<gianp> Down the road we could imagine that you could get smackage to suggest some most-generic solution to your version constraint set.
<gianp> But for the moment I don't think it would be too much work to just resolve any weird special cases manually
<rjsimmon> that, in turn, basically makes the damn thing the same as (or at least compatible with) MLton Library Project's "transparent per-library branching"
<rjsimmon> http://mlton.org/MLtonLibraryProject
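The "v1"/"v2" symlink idea from this conversation can be sketched as a pure function that decides which installed version each major-version alias should point to. This is an illustrative Python sketch; `major_aliases` is a hypothetical name, and it only computes the plan rather than creating any symlinks.

```python
def major_aliases(installed_versions):
    """Map 'v<major>' to the newest installed version with that major number."""
    newest = {}
    for text in installed_versions:
        version = tuple(int(part) for part in text.split("."))
        alias = f"v{version[0]}"
        # Compare as integer tuples, not strings, so 1.10.0 beats 1.9.0.
        if alias not in newest or version > newest[alias][0]:
            newest[alias] = (version, text)
    return {alias: text for alias, (_version, text) in newest.items()}
```

A project that needs a feature introduced in 1.7.4 would still reference 1.7.4 directly, as discussed above; the alias only helps projects that are content with any release of a given major version.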

How do we get stuff from the internet?

Should we try to support git as a first option/cut? Possibly subversion as well? Assuming we know that git://github.com/robsimmons/l10.git provides a package "elton", the following commands could probably be automated (I was using an existing package as a straw man; according to semantic versioning the tag would need to be named v0.0.1, not elton-0.0.1.)

$ git ls-remote git://github.com/robsimmons/l10.git
e3e2db6d1a6ba0c3c1a45fd64bf9aaaa62953897	HEAD
ebe7a4dffd3a3b6359d9764b2025fed8c503bddc	refs/heads/bug2
23a7a503d0796d9fc007ef3704a065c25f325bd7	refs/heads/bug2-fix
18355dbaacad819903abdb70d31eab390576ffce	refs/heads/bug2-workaround
3e1965bfa24ef2c3e51871bdd8b1cacd6fe1a3d0	refs/heads/bug3
efda5565d5a762de5ec6024750c914504d065a59	refs/heads/bug4
e3e2db6d1a6ba0c3c1a45fd64bf9aaaa62953897	refs/heads/master
6d53d37be90efcf27352e13c11b79d8f7af7cbaf	refs/tags/elton-0.0.1
$ mkdir elton
$ mkdir elton/0.0.1
$ cd elton/0.0.1/
$ git init
Initialized empty Git repository in /private/tmp/elton/0.0.1/.git/
$ git remote add origin git://github.com/robsimmons/l10.git
$ git pull origin tags/elton-0.0.1
remote: Counting objects: 1451, done.
remote: Compressing objects: 100% (1148/1148), done.
remote: Total 1451 (delta 468), reused 1244 (delta 270)
Receiving objects: 100% (1451/1451), 549.23 KiB | 998 KiB/s, done.
Resolving deltas: 100% (468/468), done.
From git://github.com/robsimmons/l10
 * tag               elton-0.0.1 -> FETCH_HEAD
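As a rough illustration of how the first step could be automated, the following Python sketch turns the text printed by `git ls-remote` into a map from version strings to commit hashes. The function name and the `prefix` parameter are assumptions made for this example; running git itself and fetching the tag are left out.

```python
def tagged_versions(ls_remote_output, prefix="v"):
    """Map version strings to commit hashes for tags named '<prefix><version>'."""
    tag_prefix = "refs/tags/" + prefix
    versions = {}
    for line in ls_remote_output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # not a '<hash> <ref>' line
        commit, ref = parts
        # Skip the peeled '^{}' entries that git lists for annotated tags.
        if ref.startswith(tag_prefix) and not ref.endswith("^{}"):
            versions[ref[len(tag_prefix):]] = commit
    return versions
```

For the listing above, calling it with prefix "elton-" would yield the single version 0.0.1; with tags named as semantic versioning suggests, the default prefix "v" would apply.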

Some thoughts on design

The following collects some more high-level thoughts on packaging for
Standard ML projects.

The first part introduces some conceptual ideas for structuring such a
system. The main idea is for packages to be persistent, resulting in a
stable and recoverable system. For example, installing a new
package should never fail. If a package has once had a satisfiable
set of dependencies, that configuration should remain available
regardless of what other packages a user might have installed on their
system.

The second part describes splitting the system in a library part and a
developer tool part.  The main idea is that tools and developers have
distinct needs. The library for the system should have a stable and
programmable API whereas the main management tool should provide
ease-of-use in the form of automation, interactive control, reasonable
defaults, etc.


* Registries

  A registry is a listing of SML packages.
  Registries may provide several versions of the same package.


** Online registries

   An online registry provides a source of downloadable packages. An
   online package should be locally installed to enable use of the
   particular package.

   Online registries could be provided by http, git, hg, darcs, svn,
   etc.  For a version-controlled source provider, access to different
   versions could be provided simply by using a tagging scheme.


** Offline registries

   An offline registry provides read-only access to packages that can
   be used for project compilation.  The offline registry is a
   file-based organization of packages. Once a package is installed it
   should never change (ie, other versions, such as updates, are
   installed alongside and independently of existing packages).

   An offline package registry can be located anywhere on the
   system. For convenience two special offline registries exist:


*** System global

    (/usr/local/lib/smackage/; /usr/lib/smackage/; ...)

    The system global registry is a special/privileged offline registry
    such that installed packages can be viewed and shared by many users
    of the same system.

*** User local

    (~/.smackage/; ...)

    The user local registry is a special/privileged offline registry such
    that a user may, without root access, install and reuse packages
    for many projects.


* Development directories

  A development directory is a Smackage-enabled SML project directory.
  It contains a special file (myproj.smack) which specifies the
  project name, version, etc., and it contains a special directory
  (.smackage) which holds all the packages needed to build the SML
  project. For example:

   ./.smackage/foo -> /usr/lib/smackage/foo-v1.6.3
   ./.smackage/bar
   ./myproj.smack
   ./myproj.cm
   ./myproj.mlb
   ./src/lots-of-files.sml

  Each library could be an actual smackage package (such as bar) or
  a symlink to a specific package installed elsewhere on
  the system (such as foo).

  
On top of the above abstractions it should be possible to support:

  - Semantic Versioning: at installation time the best package version
    match is selected and recorded as part of the package installation.
    Later, an upgrade can force an existing package to reinstall with
    newer dependencies.

  - Rollback: any package update can be rolled back to the previous
    installation since an update never actively removes previous
    versions of any package.
  
  - Restricted version constraints: in addition nothing prevents
    packages from forcing use of a specific package version.

  - Local development: it is easy to change the package dependencies
    on a "per project" basis and even to develop on existing libraries
    at the same time. The directories inside .smackage are simply
    updated to reflect the developer's intentions without needing to
    change anything in the system wide or user local setup.

  - Online package browser: it should be straightforward to create a
    package browser by supplying it with registry links from which
    the browser is populated. This would mean decentralized
    package locations, which has its pros and cons.  The big win would
    be that users could simply use github, and a new release would be
    a tag in the repository.


Some problematic issues are:

  - Garbage collection: since packages are persistent we need support
    for garbage collecting unneeded packages. This could be provided
    for the global/local registries (based on having the most recent
    version of each package and then marking their dependency
    graph.) One could add support for registering other roots so
    development directories could be taken into account too.

  - Windows: the development directory structure assumes symlinks. We
    could fix this on Windows by adding support for a link file that
    is specially parsed by the smackage system. However, these files
    would appear as simple text files to the developer.
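The garbage collection issue listed above amounts to a mark phase over the dependency graph followed by a sweep. A minimal Python sketch, assuming packages are identified by hypothetical "name-version" strings and that `dependencies` maps each package to the packages it was installed against:

```python
def reachable(roots, dependencies):
    """Mark phase: every package reachable from `roots` via `dependencies`."""
    marked = set()
    pending = list(roots)
    while pending:
        package = pending.pop()
        if package in marked:
            continue  # already visited; also guards against cycles
        marked.add(package)
        pending.extend(dependencies.get(package, ()))
    return marked


def collectable(installed, roots, dependencies):
    """Sweep phase: installed packages that no root depends on."""
    return set(installed) - reachable(roots, dependencies)
```

Here `roots` would hold the most recent version of each package plus anything registered by development directories; choosing the roots is the policy question, while the traversal itself is standard.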


The system should be cleanly split in two parts. The first being a
library to support tool writing. The second part being a developer
tool for managing packages.

* The software package library (smacklib)

  A stable API for managing packages and registries. This library is
  for tooling and should not provide any command-line interface for
  software developers. Also, this library is not a build
  system. Issues related to compilation should be taken care of by
  existing build systems and abstracted by this library.

  Implementation independent description of:
  - libraries and/or executables
  - dependencies and resolution (eg, via semantic versioning)
  - language dependencies and features
  - compiler / build system abstractions

  Should be independent of actual compilers, operating systems, and
  package locations (ie, the environment must be completely abstract).


* The software package manager (smack)

  Creation, installation and removal of packages.

  This should provide an easy to use interface for software
  developers, à la cabal-install.

  Should be integrated with package registries and mirrors thereof
  (online and offline).


* Requirements

  For maximal portability among SML implementations, the package
  system should be written in Core SML + references with as few
  dependencies as possible.

How version symlinks work

We want to have symlinks present for any version that is referenced in a 'require' directive. By example:

End user

An end user wants to install a library 'foolib'. He just wants the latest version. He runs 'smack install foolib', and gets the latest version '1.7.4'. In his MLB/CM files, he can reference:

$(SMACKAGE)/foolib/v1

v1 will always be symlinked to the latest 1.x.x version.

Library author 1

The provider of foolib depends on barlib. The spec file for foolib contains a require line:

require barlib 2.9.x;

In the MLB/CM files for foolib, the author can now reference

$(SMACKAGE)/barlib/v2.9

And be assured that on any client system, v2.9 will point to the latest 2.9.x version available on the client system.

(Note: the above example would also work if the author required any 2.x.x version, as one could reference v2 as well)
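The mapping from a require pattern to a symlink name and its target could look roughly like the following Python sketch. The helper names are hypothetical, and the "x" wildcard convention is taken from the `require barlib 2.9.x;` example above.

```python
def fixed_prefix(pattern):
    """'2.9.x' -> (2, 9); '2.x.x' -> (2,); '1.7.4' -> (1, 7, 4)."""
    prefix = []
    for part in pattern.split("."):
        if part == "x":
            break  # everything from the first wildcard on is unconstrained
        prefix.append(int(part))
    return tuple(prefix)


def alias_name(pattern):
    """Name of the symlink a require pattern refers to, e.g. 'v2.9'."""
    return "v" + ".".join(str(number) for number in fixed_prefix(pattern))


def alias_target(pattern, installed_versions):
    """Newest installed version matching the pattern, or None if none match."""
    prefix = fixed_prefix(pattern)
    parsed = [tuple(int(p) for p in v.split(".")) for v in installed_versions]
    matches = [v for v in parsed if v[:len(prefix)] == prefix]
    return ".".join(str(n) for n in max(matches)) if matches else None
```

On a client system with barlib 2.9.1 and 2.9.10 installed, the v2.9 alias would point at 2.9.10, since the comparison is numeric rather than textual.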

Package indexes

We're taking the cheap-and-easy route for package indexes. Given that we already know how to parse spec files, a package index is just a bunch of specs concatenated together with some kind of delimiter. This will not scale infinitely, but it'll work for the kinds of scales we're talking about at this stage.
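A sketch of reading such an index in Python, assuming (hypothetically, since no delimiter is fixed above) that specs are separated by a line containing only "---" and that each spec has a `provides <package> <version>;` line:

```python
def split_index(index_text, delimiter="---"):
    """Split a concatenated index back into the individual spec texts."""
    specs, current = [], []
    # A trailing delimiter is appended so the last spec is flushed too.
    for line in index_text.splitlines() + [delimiter]:
        if line.strip() == delimiter:
            if any(entry.strip() for entry in current):
                specs.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    return specs


def provided(spec_text):
    """Return (package, version) from the spec's 'provides' line, or None."""
    for line in spec_text.splitlines():
        words = line.strip().rstrip(";").split()
        if len(words) == 3 and words[0] == "provides":
            return (words[1], words[2])
    return None
```

Looking up a package is then a linear scan over `split_index(...)`, which matches the observation that this approach is fine at small scale but will not scale indefinitely.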

Hoogle-like search

A Hoogle-like search engine would be a killer feature; essentially, it allows you to search on an approximate type signature and find the functions that have matching type signatures. Try this example search.