Explaining some features of asdf-dependency-grovel

Some people asked me why asdf-dependency-grovel (abbreviated adg, to save my fingers) merges systems. Since I wrote it, a few more questions came up, and so I’ll try to answer them.

Why does adg merge systems?

ASDF has a useful dependency-tracking mechanism: if a component changes, it automatically rebuilds that component and all components that depend on it. But what happens to dependencies between systems? They are problematic for ASDF, as it doesn’t track the need to recompile components across system boundaries. For example:

(defsystem foo
   :components ((:file "a")))

(defsystem bar
   :depends-on (:foo)
   :components ((:file "b")))

Suppose that file “a” in system FOO changes; if you load system BAR, file “a” will be recompiled; but file “b” will not! If file “b” uses a macro from file “a”, you will load the old version of that macro from FASLs, and things will break. Ow.
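The same staleness can be reproduced in a single Lisp image, without ASDF. This is a minimal sketch: GREETING stands in for a macro from file "a" and GREET for a function in file "b" (both names are made up for illustration).

```lisp
;; Stand-in for file "a": a macro definition.
(defmacro greeting () "hello")

;; Stand-in for file "b": a function that uses the macro.
(defun greet () (greeting))
(compile 'greet)                       ; the macro call is expanded here

;; File "a" changes and is recompiled...
(defmacro greeting () "goodbye")

;; ...but GREET still carries the old expansion.
(defparameter *stale-result* (greet))  ; "hello"

;; Only recompiling the macro's user picks up the change.
(defun greet () (greeting))
(compile 'greet)
(defparameter *fresh-result* (greet))  ; "goodbye"
```

Across system boundaries, nothing triggers that second recompilation, which is the whole problem.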

There are two solutions for this problem:

  • Make ASDF track compilation across system boundaries. This is expensive: every user would need to update ASDF, and every user would then have to sit through endless recompilation sessions whenever so much as a comment in a low-level system changes.
  • Merge systems where it makes sense. adg ensures that dependencies are kept up-to-date and minimal: only those things that are actual compile-time (or load-time) dependencies are recorded.

This exact problem is what bit us in mcclim once. I spent a long time thinking about it and came up with adg in the end. So there.

How does it merge systems?

It merges systems in the same way it operates when not merging systems, really: ASDF recursively propagates an instrumented load-op (which translates to a compile-op for uncompiled sources) down the tree of dependencies, which ensures that adg can collect references and definitions. When a definition or a dependency is recorded, adg checks whether it’s in a list of “interesting” systems and omits those components and dependencies that are not interesting.
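The filtering step could be sketched roughly like this. Every name below is hypothetical and components are modelled as plain (system . file) pairs, so this shows the idea only, not adg's actual code.

```lisp
;; Hypothetical sketch of the "interesting systems" filter.
(defparameter *interesting-systems* '("foo" "bar")
  "Names of the systems being merged.")

(defvar *dependencies* '()
  "Recorded edges, each of the form (USER-FILE . PROVIDER-FILE).")

(defun interesting-p (component)
  "COMPONENT is a (SYSTEM-NAME . FILE-NAME) pair."
  (member (car component) *interesting-systems* :test #'string=))

(defun record-dependency (user provider)
  "Record that USER needs PROVIDER, unless either lies outside the merged systems."
  (when (and (interesting-p user)
             (interesting-p provider)
             (not (equal user provider)))
    (pushnew (cons (cdr user) (cdr provider)) *dependencies* :test #'equal)))

(record-dependency '("bar" . "b") '("foo" . "a"))           ; kept
(record-dependency '("bar" . "b") '("some-library" . "x"))  ; dropped: outside the merge
```

After both calls, only the edge from "b" to "a" has been recorded.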

Why not just use XREF?

XREF is a mechanism provided by many CL implementations which allows queries like who-calls and who-macroexpands. who-macroexpands would cover a good part of compile-time dependencies, but there are many more types of compile-time dependencies! For example:

  • Definition of a method that uses a class defined in another file
  • Definition of a package that uses another package
  • Use of a symbol interned in a package defined in another file
  • Calling (at compile or load time) a function defined in another file
  • many more…

I do not know a single XREF mechanism that has queries for all these things, and so adg has to grovel to the compiler for them. Thanks to *macroexpand-hook*, they are pretty easy to find out (-:
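To show what the hook makes visible, here is a small sketch that records the operator of every macro form passing through it. The names *seen-operators* and recording-hook are made up for this example, and macroexpand-1 stands in for the compiler.

```lisp
(defvar *seen-operators* '()
  "Operators of the macro forms that passed through the hook.")

(defun recording-hook (expander form env)
  "Note which macro is being expanded, then expand it as usual."
  (when (consp form)
    (push (first form) *seen-operators*))
  (funcall expander form env))

;; MACROEXPAND-1 consults *MACROEXPAND-HOOK* just as the compiler does.
(let ((*macroexpand-hook* #'recording-hook))
  (dolist (form '((defun area (shape) shape)
                  (defmacro with-shape (var) var)
                  (defclass circle (shape) ())))
    (macroexpand-1 form)))
```

Afterwards *seen-operators* contains DEFUN, DEFMACRO and DEFCLASS, which is the kind of information no single who-macroexpands query hands over in one place.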

Can I have a simple example for using adg?

Sure thing. Assuming you have one (or more) horribly long :serial t system(s):

 (defsystem something-awful
   :serial t
   :components ((:module "foo" :components (#| lots |#))))

You rename the system to indicate that it’s the serial one, then add this argument to the defsystem form: :default-component-class asdf-dependency-grovel:instrumented-cl-source-file, and replace occurrences of :module with asdf-dependency-grovel:instrumented-module. The result would look like this:

 (defsystem something-awful/serial
   :serial t
   :default-component-class asdf-dependency-grovel:instrumented-cl-source-file
   :components ((asdf-dependency-grovel:instrumented-module "foo" :components (#| lots |#))))

(If it’s more than one system, you have to do the previous step for all the systems that are involved.) Then you define a second system that discovers the dependencies:

 (defsystem something-awful/dependencies
   :components ((asdf-dependency-grovel:component-file "something-awful"
                 :output-file #p"something-awful-components.lisp-expr"
                 :load-system something-awful/serial ; the system to load; it should depend on all the merged systems
                 :merge-systems (something-awful/serial) ; if you have more than one system, list them here
                 :cull-redundant t ; remove unnecessary dependencies? Makes for easier-to-read component files
                 :verbose nil))) ; silly debugging output

I suggest you put the definitions for the /dependencies system and the /serial system(s) in a separate file, and then, in the original system definition file, define your new something-awful system like this:

 (defsystem something-awful
   :components
   #.(let ((component-file (make-pathname :name "something-awful-components"
                                          :type "lisp-expr"
                                          :defaults *load-truename*)))
       (when (probe-file component-file)
         (with-open-file (f component-file :direction :input)
           (read f)))))

And that’s it. You can now load the separate file, run (asdf:oos 'asdf-dependency-grovel:dependency-op :something-awful/dependencies) and have it emit the component information into something-awful-components.lisp-expr. Done! Your users can now load the new system and hack on it, and ASDF can rely on the dependency information in that file.

You should re-generate the component file (using the :dependency-op) in these cases:

  • After hacking on something and incrementally compiling, the system breaks. This probably means that a compile/load-time dependency was introduced somewhere down the line.
  • After adding a new file. This requires that you find a sensible place for it in the serial order of the /serial system, then have adg re-generate component info.

"make depend" for lisp

As the software-publishing planet.lisp.org crowd probably knows, writing simple defsystems with ASDF is pretty easy. Dependencies are not hard to find (and to specify), if you have up to 10 or 20 components. Beyond that, though, it becomes pretty painful to maintain a system definition file that doesn’t result in a compilation error. After that, it’s easier to use a serial system definition: just find a defined order to compile and load the files.
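For concreteness, here are the two styles side by side for a hypothetical four-file system. The system and file names are invented, and the explicit version assumes that "shapes" really does not need anything from "macros".

```lisp
;; Serial: every file implicitly depends on the one before it.
(asdf:defsystem "example-serial"
  :serial t
  :components ((:file "package")
               (:file "macros")
               (:file "shapes")
               (:file "drawing")))

;; Explicit: only the dependencies that actually exist are listed,
;; and each one has to be kept correct by hand.
(asdf:defsystem "example-explicit"
  :components ((:file "package")
               (:file "macros"  :depends-on ("package"))
               (:file "shapes"  :depends-on ("package"))
               (:file "drawing" :depends-on ("macros" "shapes"))))
```

With four files the explicit form is easy to get right; with a hundred, every forgotten :depends-on entry is a compilation error waiting to happen.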

There’s a tradeoff, though: Serial system definitions are a pain for users who want to hack on your code. If someone changes a file (e.g. one close to the start of the series of components), every single component after it in the series must be recompiled. Dependencies would help, but we already established that they’re too hard to maintain by hand in a large system. What’s a system definition maintainer to do?

Have the computer do the dirty work, of course (-:

A few months ago, I had a pretty neat idea: To find out the compile-time dependencies in a system, you’d have to hook into the compiler. And the compiler provides one such hook: *macroexpand-hook* - of course, all the operators that can construct a compile-time dependency must be macros - and all the standard operators are!

So, I wrote a program called asdf-dependency-grovel that compiles a serial asdf system (or an asdf system that just so happens to be in working order), and extracts components with dependency information.

Here’s an outline of what it does for the prime example of a compile-time dependency: a file uses a macro that is defined in another file:

    (defmethod asdf:perform :around ((op asdf:compile-op) (comp asdf:cl-source-file))
      (let* ((old-hook *macroexpand-hook*)
             (*macroexpand-hook*
              (lambda (fun form env)
                 (when (listp form)
                   (case (first form)
                     ((defmacro)
                      (signal-macroexpansion 'provides (second form) (first form) comp))
                      ;; many many more form types cut
                     (t (signal-macroexpansion 'uses (second form) (first form) comp))))
                 (funcall old-hook fun form env))))
        (call-next-method)))

And all that signal-macroexpansion does is send a little notice to the function that keeps track of dependencies (i.e. it invokes a closure on a hook) to tell it that there’s either a use of a previously defined macro from the current component, or a new definition from the current component.
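The receiving end of those notices could look something like the following. It is a hypothetical sketch (note-macroexpansion, *providers* and *edges* are invented names, and components are plain file-name strings); adg's real bookkeeping is more involved.

```lisp
(defvar *providers* (make-hash-table :test #'eq)
  "Maps a macro name to the component that defined it.")

(defvar *edges* '()
  "Discovered dependencies, each of the form (USER . PROVIDER).")

(defun note-macroexpansion (kind name component)
  "KIND is :PROVIDES or :USES; COMPONENT is a file name."
  (ecase kind
    (:provides (setf (gethash name *providers*) component))
    (:uses (let ((provider (gethash name *providers*)))
             ;; Only macros defined by another tracked component create an edge.
             (when (and provider (string/= provider component))
               (pushnew (cons component provider) *edges* :test #'equal))))))

(note-macroexpansion :provides 'with-frob "a")
(note-macroexpansion :uses 'with-frob "b")
(note-macroexpansion :uses 'with-frob "b")   ; repeated use, no duplicate edge
(note-macroexpansion :uses 'when "b")        ; standard macro, no provider recorded
```

After these calls *edges* holds the single pair ("b" . "a"): file "b" must be compiled after file "a".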

It has additional handlers for:

  • defclass and define-condition (“use” of superclasses, and definition of classes for use by defmethod and other defclass forms),
  • defpackage and in-package,
  • defun - it rewrites the function’s macroexpansion into code that signals a compile-time use,
  • defmethod and defgeneric (“use” of generic functions and classes on which a method specializes), and
  • defconstant - it makes the constant a symbol-macro whose expansion signals that the variable was used and supplies the constant’s value.
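The defconstant item relies on the fact that symbol-macro expansions also go through *macroexpand-hook*. Here is a small sketch of that mechanism, with invented names (+max-frobs+, symbol-recording-hook) and macroexpand-1 standing in for the compiler; it illustrates the idea and is not adg's own code.

```lisp
(defvar *used-symbols* '()
  "Symbol macros whose expansion passed through the hook.")

(defun symbol-recording-hook (expander form env)
  "Record symbol-macro uses, then expand as usual."
  (when (symbolp form)
    (pushnew form *used-symbols*))
  (funcall expander form env))

;; Hypothetical stand-in for a constant that was turned into a symbol macro.
(define-symbol-macro +max-frobs+ 42)

(defparameter *expansion*
  (let ((*macroexpand-hook* #'symbol-recording-hook))
    (macroexpand-1 '+max-frobs+)))
```

The expansion still yields the value 42, while the hook has noted that +max-frobs+ was used. A genuine constant would never pass through the hook at all.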

This code managed to automatically generate a working dependency graph for mcclim, even merging 8 pretty large systems into one in the process. The resulting system now contains a total of 168 components with 192 non-redundant dependencies!

If you want to try your luck with asdf-dependency-grovel, check out the cliki page.

Also, here are two graphs of the new McCLIM system’s dependencies and the CLX system’s dependencies.

A break for advertising (and a bit of synchronicity)

(No Lisp content here, move along, planet.lisp reader…)

For a few months now, I’ve been heckling Mac DVD burning software authors to provide full DVD+RW support (as in, the ability to append files to a burned DVD+RW, the way growisofs and its amazing front-end, k3b do on Linux). To no avail. I got all sorts of lame excuses, from lack of hardware support to no interest.

To spite those lazy burning application authors, I decided to do it myself. Using the freshly ported growisofs (lack of hardware support, hah), I wrote a horrible proof-of-concept application that was able to add files to the file system of a DVD+RW medium. When I discovered that the author of BurnAgain (which did what I wanted, only for CDs) lives only a few streets away from me, I immediately pitched my idea to him. And he agreed! I gave him my proof of concept, and now he has delivered a complete product: Behold the shiny BurnAgain DVD!

If you have a Mac, and are interested in DVD creation, you really should buy it (or give it a try, it comes with 20 trial burns). No other product (not the Finder, not Toast and certainly not Disco) can do what it does: Incrementally add files to a DVD medium.

McCLIM 0.9.4 "Orthodox New Year" released!

We released McCLIM 0.9.4 today. You may be wondering what’s so cool about it this time, so here’s a short list:

  • A new input editor and editing substrate called DREI (covered here before),
  • several great improvements to gtkairo (see lisp porn here), and
  • many cool new features and bug fixes, including a few clim 2.2 functions.

(Of course, there are probably lots of new bugs in there, too. Please let us know about them at mcclim-devel at common-lisp.net!)

(And of course, the release announcement has the obligatory editing-under-stress error. You will get the following reward for finding it:
)