awwx.ws

What is the right way to do library dependencies?

Library dependencies are usually handled by having code start off with some “use”, “require”, or “import” statements which list which libraries the code needs. Then we release our code, often giving it a version number like “3.0.1”.

The code in each release is immutable; once I’ve released “3.0.1”, that’s it, it’s released. This is a good thing so that if I’m deploying my program with a library version 3.0.1 I know what code I’m actually using, and if you say “my program is working with your library 3.0.1 but not 3.0.2” then we all know what code we’re actually talking about.

In more dynamic languages I may be able to have my code choose between libraries. For example, my code might use a faster JSON::XS library if it’s available, but fall back to a slower JSON library otherwise.

OK, that’s all good so far. Now for the problems.

What if I write my code to use JSON and later a faster JSON::XS becomes available? Now I need to release a new version, perhaps version “3.0.3”, which knows about JSON::XS. But JSON::XS is a plug-in replacement for JSON; the library code hasn’t changed, it doesn’t even know what version it’s using. What we want is to be able to simply drop in JSON::XS in place of JSON; what we end up having to do is release new versions of all the libraries that use JSON.

In the future we can make better decisions because we have more information. So the question is, how do we make good decisions now with the information we have now, without making it hard to make better decisions later?

The usual reaction to the dependency problem is to invent some bureaucracy, rules that we hope will keep us out of trouble. “Every library should have a documented public API so that internals can be changed without affecting clients”. “Use interfaces so that one implementation can be substituted for another”. Etc. and so on.

This works OK, more or less, if we’ve correctly foreseen what changes we’ll want to make. But if there’s some other change that it turns out we need, or if we don’t need the changes we anticipate, then that work on the public API’s and interfaces is wasted.

And clutters up the code. It’s pretty easy to look at some piece of code and see that it’s not being used right now and take it out. But if it’s there for some possible future need, then it will get left in. And ironically, in my experience when the code turns out not to have anticipated the changes I need, they’re often harder to implement because the code is cluttered with unused mechanisms.

So, what to do? Let’s keep released code immutable since that makes it much easier to keep track of, but how about moving the knowledge about dependencies between libraries elsewhere, where they can be updated independently.

I threw together some toy code to implement a little knowledge base about dependencies between code. Once we have some data about dependencies (“A can use either B1 or B2 and A also needs C, and I like B2 better than B1, but B2 conflicts with C”) then we can have the computer do the boring grunt work of finding out what alternatives will work (“OK, load B1 and A”).

What sorts of things do we want to know about libraries and dependencies between them? Well, usually we can’t have more than one version of a library loaded at the same time.

(conflicts arc2 arc3 arc3.1)

And I may have some preferences about which versions I think are better, which version I’d prefer to use if I have a choice.

(prefer arc3.1 arc3 arc2)

And some versions are backward-compatible:

(replaces arc3 arc3.1)

Here’s a program I’m working on... it needs arc3 and my JSON parser:

(works myapp arc3 fromjson0)

This is read as “myapp works with arc3 and fromjson0”. It’s possible to describe other combinations; for example, I might say that myapp works with other implementations of Arc.

My JSON parser needs Arc and my parser combinator library, the parser combinator also needs Arc:

(works fromjson0 arc3 parsecomb0)
(works parsecomb0 arc3)

The versions of Arc need MzScheme:

(works arc2 mzscheme)
(works arc3 mzscheme)
(works arc3.1 mzscheme)

Though it would be more accurate to say what version of MzScheme is needed. And finally, I need to say that MzScheme works by itself:

(works mzscheme)

OK, enough info. What are my options?

arc> (resolve)
(mzscheme arc3.1 parsecomb0 fromjson0 myapp)
(mzscheme arc3 parsecomb0 fromjson0 myapp)
nil

Each line shows a different alternative of which library versions would be loaded, sorted with the highest preference listed first. My app needs arc3, and arc3.1 is backwards compatible with arc3, and I prefer arc3.1 to arc3, so that’s the option that gets listed first.

Let’s say someone comes along and implements a better, stronger, faster version of my JSON parser in Scheme:

(replaces fromjson0 scheme-json)
(works scheme-json mzscheme)

Now I have some more options:

arc> (resolve)
(mzscheme arc3.1 parsecomb0 fromjson0 myapp)
(mzscheme arc3.1 scheme-json myapp)
(mzscheme arc3 parsecomb0 fromjson0 myapp)
(mzscheme arc3 scheme-json myapp)
nil

Let’s not forget that the Scheme parser is better:

(prefer scheme-json fromjson0)

arc> (resolve)
(mzscheme arc3.1 scheme-json myapp)
(mzscheme arc3.1 parsecomb0 fromjson0 myapp)
(mzscheme arc3 scheme-json myapp)
(mzscheme arc3 parsecomb0 fromjson0 myapp)

Good, now the option to use the Scheme parser is on top.

But we’re hardly done yet. Arc is a language for hackers, and so naturally will have lots of hacks... along comes an alternative implementation of Arc:

(replaces arc3.1 rainbow)
(works rainbow java)
(works java)
(conflicts java mzscheme)

And now I have even more options:

arc> (resolve)
(mzscheme arc3.1 scheme-json myapp)
(mzscheme arc3.1 parsecomb0 fromjson0 myapp)
(mzscheme arc3 scheme-json myapp)
(mzscheme arc3 parsecomb0 fromjson0 myapp)
(java rainbow parsecomb0 fromjson0 myapp)

Now I can see that I can use Rainbow if I want, but I’ll have to use my JSON parser if I do.

Comment over at the Arc forum.


P.S.  If you’re curious, here’s my current toy implementation... it uses xloop0.arc, table-vivifier, testify-iso0.arc, and butlast0.arc and partition0.arc from Anarki.

(def clear ()
  (= have* (table))
  (= using* (table))
  (= need* (table)))

(def reset ()
  (clear)
  (= hackbase* nil))

(reset)

; lookup data from the hackbase

(def hackbase-verb (v)
  (map cdr (keep [is _.0 v] hackbase*)))

(def hackbase-verb-hack (verb hack)
  (keep [and (is _.0 verb) (is _.1 hack)] hackbase*))

; all combinations of picking two items from a list, treating the
; result as a set: don't return both (a b) and (b a)
; (combinations '(a b c d))
; => ((a b) (a c) (a d) (b c) (b d) (c d))
; is there some standard name for this?

(def combinations2 (lst)
  (accum a
    (xloop (lst lst)
      (when (> len.lst 1)
        (let one car.lst
          (each two cdr.lst
            (a (list one two))))
        (next cdr.lst)))))

(def add-conflict (a b)
  (if (is a b)
       (err "hack can't conflict with itself" a))
  (set (conflicts* (list a b)))
  (set (conflicts* (list b a))))

(def compute-conflicts ()
  (= conflicts* (table))
  (each hacks (+ (hackbase-verb 'conflicts) (hackbase-verb 'replaces))
    (each (one two) (combinations2 hacks)
      (add-conflict one two))))

(def better-than (a)
  (keys better*.a))

(def worse-than (a)
  (keys worse*.a))

; A is better than B

(def add-preference (a b)
  (if (is a b)
       (err "oops, hack can't be better than itself" a))

  (if (or worse*.b.a better*.a.b)
       (err "conflicting preferences" a b))

  (set worse*.a.b)
  (set better*.b.a)

  ; A is better than B, so A is also better than everything B
  ; is better than

  (map [add-preference a _] (worse-than b))

  ; and B is worse than A, so B is also worse than everything
  ; A is worse than

  (map [add-preference _ b] (better-than a)))

(def compute-preferences ()
  ;; a -> b -> t  B is worse than A
  (= worse* (table nil (fn () (table))))

  ;; a -> b -> t  B is better than A
  (= better* (table nil (fn () (table))))

  (each hacks (hackbase-verb 'prefer)
    (each (one two) (combinations2 hacks)
      (add-preference one two))))

(def recalc ()
  (compute-conflicts)
  (compute-preferences))

(recalc)

; add data to the hackbase

(def record (datum)
  (push datum hackbase*)
  (recalc)
  nil)

(def unrecord ((o datum))
  erp.datum
  (if datum
       (= hackbase* (rem datum hackbase*))
       (= hackbase* (butlast hackbase*)))
  (recalc))

(mac conflicts hacks
  `(record `(conflicts ,@',hacks)))

(mac prefer hacks
  `(record `(prefer ,@',hacks)))

(mac replaces (hack1 hack2)
  `(record `(replaces ,',hack1 ,',hack2)))

(mac works hacks
  `(record `(works ,@',hacks)))

(mac patches (hack1 hack2)
  `(record `(patches ,',hack1 ,',hack2)))

(mac forget (datum)
  `(unrecord ',datum))

; Returns a list of the different ways the prerequisites to the hack
; can be fulfilled, not including replacements.
; For example, if A needs C and D, or will also work with just E,
; then (works-with 'a) => ((c d) (e))

(def works-with (hack)
  (map [cddr _] (hackbase-verb-hack 'works hack)))

(def alternatives-to (hack)
  (map [_ 2] (hackbase-verb-hack 'replaces hack)))

(def all-alternatives (hack)
  (cons hack (mappend all-alternatives (alternatives-to hack))))

; select which hacks you need

(mac have hacks
  `(do (map [set (have* _)] ',hacks)
       nil))

(mac use hacks
  `(do (map [set (using* _)] ',hacks)
       nil))

(mac need hacks
  `(do (map [set (need* _)] ',hacks)
       nil))

(def beginning-branch ()
  (+ (map [list 'have _] (keys have*))
     (map [list 'use  _] (keys using*))
     (map [list 'need _] (keys need*))))

; the code looks for solutions by exploring branches
;
; a branch is a list of
; (have hack) -  a hack we have (it's already loaded)
; (using hack prereq...) - a hack we've committed to load in this branch
;    and its resolved dependencies (each prereq is already a "have"
;    or "using" hack in this branch)
; (use hack) - we want to use this hack
; (need hack) - we want to use this hack or one of its replacements

(def pull-todo (branch)
  (iflet p (some (fn (verb)
                   (pos [is (car _) verb] branch))
                 '(use need))
    (list (branch p) (+ (cut branch 0 p) (cut branch (+ p 1))))))

(def committed-hacks (branch)
  (trues [aif (in (car _) 'have 'using) (cadr _)] branch))

(def hack-conflicts (branch hack)
  (some [conflicts* (list hack _)] (committed-hacks branch)))

; list best alternatives first

(def alternatives (hack)
  (sort (fn (a b) better*.b.a)
        (all-alternatives hack)))

; enumerate all combinations of lists in ls
; (cross '((a b c) (d e) (f)))
; => ((a d f) (a e f) (b d f) (b e f) (c d f) (c e f))

(def cross2 (l1 l2)
  (accum a
    (each x1 l1
      (each x2 l2
        (a (cons x1 x2))))))

(def cross (ls)
  (if (no ls)
       nil
      (no (cdr ls))
       (map list (car ls))
       (cross2 (car ls) (cross (cdr ls)))))

; t if the branch is already committed to providing this hack

(def branch-has (branch hack)
  (some hack (committed-hacks branch)))

(def cross-alternatives (hacks)
  (cross (map [alternatives _] hacks)))

(def all-prereq-options (hack)
  (mappend [if _
                (cross-alternatives _)
                '(nil)]
           (works-with hack)))

(def rem-conflicts (branch options)
  (rem [some [hack-conflicts branch _] _] options))

; Return a list of the different ways the prerequisites to the hack
; can be fulfilled.  Goes down just one level (doesn't return the
; prerequisites of the prerequisites).

; For example, if A needs C and D, or will also work with just E, and
; F is a replacement for C, then
; (prereq-options nil 'a) => ((c d) (f d) (e))
;
; But, if a hack has no prerequisites, return one solution of an empty list
; to signify no prerequisites are needed.
; (prereq-options nil 'mzscheme) => (nil)

; Would like to sort this, putting more preferred prereq options first

(def prereq-options (branch hack)
  ; probably need to dedup by set instead of by list
  (dedup
   (rem-conflicts branch
    (all-prereq-options hack))))

; convenient debugging point

(def take-next-step (branch)
  (resolve-step branch))

(def add-prereqs ((using hack . old-prereqs) new-prereqs)
  `((using ,hack ,@(dedup (+ old-prereqs new-prereqs)))))

(def branch-add-using (branch hack prereqs)
  (iflet p (pos [and (is (car _) 'using) (is (cadr _) hack)] branch)
    (+ (cut branch 0 p) (cut branch (+ p 1)) (add-prereqs (branch p) prereqs))
    (+ branch `((using ,hack ,@prereqs)))))

(def resolve-use (branch (hack))
  (unless (hack-conflicts branch hack)
    (each prereqs (prereq-options branch hack)
      (let b (+ branch (mappend (fn (prereq)
                                  (unless (branch-has branch hack)
                                    `((use ,prereq))))
                                prereqs))
        (let b (branch-add-using b hack prereqs)
          (take-next-step b))))))

(def resolve-need (branch (hack))
  (each hack (alternatives hack)
    (resolve-use branch (list hack))))

(def resolve-step (branch)
  (iflet (todo branch) (pull-todo branch)
    ((case (car todo) use resolve-use need resolve-need)
     branch (cdr todo))
    (report* branch)))

(def load-order (branch)
  (xloop (result nil
          todo (trues [if (is (car _) 'using) (cdr _)] branch))
    (let (fulfilled rest) (partition (fn ((hack . prereqs))
                                       (all [mem _ result] prereqs))
                                     todo)
      (when (no fulfilled) (err "unresolved prereqs" todo))
      (let result (+ result (map [car _] fulfilled))
        (if rest
             (next result rest)
             result)))))
                                        
; maybe generate lots of solutions, and then show the top ones?

(def resolve ()
  (let limit 12
    (catch
     ; my laziness knows no bounds
     (= report* (fn (branch)
                  (write (load-order branch))
                  (prn)
                  (if (< (-- limit) 1) (throw nil))))
     (resolve-step (beginning-branch)))
    (wipe report*)))