Issue 12

by Alaric Snell-Pym

2010-11-15 +0000

0. Introduction

Welcome to issue 12 of the Chicken Gazette, live from icebound rural England!

1. The Hatching Farm

2. Core development

Other than the usual fixing of minor bugs, it has been decided that the ability to statically build eggs is not worth maintaining; the first step, removing documentation for this facility, has now been done.

But the big news is that version 4.6.3 has been tagged. Here's the changelog:

- Peter Bex has cleaned up the makefiles heavily, making the
  build more maintainable and easier to modify; thanks to all
  who helped testing this new build
- renamed the makefile to `GNUmakefile' to catch using the
  a make(3) other than GNU make
- `-fwrapv' is disabled on OpenBSD, since the default compiler
  does not support this option (thanks to Christian Kellermann)
- on Solaris `gcc' is used by default, override `C_COMPILER'
  to use the Sun compiler instead
- added the new foreign type `size_t' (suggested by Moritz 
  Heidkamp)
- added the missing `unsigned-integer64' foreign type (thanks
  to Moritz for catching this)
- added support for proxy-authentification to `chicken-install'
  (thanks to Iruata Souza)
- `define-record' now allows defining SRFI-17 setter procedures
  for accessing slots
- removed the stub definition for `define-macro'
- added new foreign type `pointer-vector' which maps to `void **'
  and provided a low-level API in the `lolevel' library unit
  for manipulating pointer vectors
- declaring a function `notinline' will prevent direct-call
  optimization for known procedure calls
- when compiling in C++ mode, the compiler will be called with
  the `-Wno-write-strings' option
- the expansion of DSSSL lambda-lists uses now `let-optionals*'
  internally instead of `let-optionals' and so allows back-references
  to earlier formal variables; this also results in faster and
  more compact code for argument-list destructuring (thanks to Alan
  Post)
- Peter Bex has contributed various bugfixes and performance
  enhancements to the `irregex' library unit
- fixed bug in `getter-with-setter' that modified the first argument
  if it already had a setter procedure attached
- added a SRFI-17 setter to `list-ref'
- control-characters in symbol-names are now properly escaped if
  the symbol is printed readably (thanks to Alaric Snell-Pym Blagrave 
  for pointing this out)
- added literal blob syntax ("#{ ... }")
- `delete-directory' now optionally deletes directories recursively
- fixed incorrect size of internal data vector used in `time'
  (thanks to Kon Lovett)
- `list-tail' gives now a better error message when passed a non-list
  argument
- deadlock in the scheduler now terminates the process instead of
  attempting to throw an error
- added some sanity checks to the scheduler
- when installing from a local directory `chicken-install' now removes
  existing `*.so' files in that location to avoid stale binaries when
  the `make' syntax is used in setup scripts

The source code for the development snapshot may be had from http://code.call-cc.org/dev-snapshots/.

Since 4.6.3 was tagged off, work has continued on the experimental and main branches, in preparation for 4.6.4; further work on automatic unboxing, more scheduler robustness, and further minor fixes.

3. Chicken Talk

The awesomely-named Joe Python, on chicken-users, spotted a problem installing the iup egg. Thomas Chust instantly replied that it was due to a missing dependency, but the error message was misleading; a fix has been added to the egg's trunk version. However, the problem did not abate, and the thorny issue of how to correctly guess the required linker flags in all sorts of different build environments came up, but no long-term solution was forthcoming. CSC_OPTIONS is the way to go, for now, but integration with native package systems is an intriguing possiblity, in effect building per-platform lists of linkage flags for each egg.

Joe then asked how find-library worked; Felix explained.

Issues with eggs depending on recent versions of chicken came up in Imran Rafique's question about chicken-doc-admin errors. Imran's issue was hopefully resolved, but us egg authors might need to be careful about depending upon recent features too soon!

With amazing timing, Hugo Arregui asked about how to upgrade eggs after a Chicken upgrade; the conclusion was just to reinstall eggs as needed if they didn't work any more, as the binary version numbering system should handle most catastrophic changes.

Yi DAI boldly questioned one of the fundamental tenets of LISP, that is, the nature of cons cells as the fundamental building block of linked lists. A thrilling discussion about the definition of terms sprung up, and thankfully, it all fizzled out without any blood being spilt.

John Tobey asked an interesting question: Are there any other Cheney-MTA-inspired systems out there? Peter Bex had heard that REBOL did, but couldn't follow it up very far. John Cowan pointed out that continuation-based compilers were out of fashion, but Felix made us feel better by pointing out that they still rocked. John Tobey followed up with a proposal to build better support for tail-calls directly into C, which was met with great interest.

Hans Nowak asked how people's coding workflow worked in Chicken and Scheme in general; the debate is still open, but it's already clear that people have widely differing styles, but everyone loves the REPL!

4. Omelette Recipes

Unusually, I'm not going to talk about a specific egg this week. Instead, I'm going to get back to basics and share some tips for making the best of closures and continuations, which are perhaps the biggest conceptual things new Scheme users have to get their heads around to unleash the language's full power.

Many C programmers will, at some point in their lives, have had to write some code that preserves a list of things that need to happen; event handlers, callbacks, atexit handlers to clear up all your manually-managed resources, that sort of thing. Generally, the list is handled in one of two ways - either a linked list of structs, with a field declaring the type of action to perform and a union of structs containing the "parameters" for each action, or a linked list of structs which have a function pointer and a void* pointing to the parameters for that function, which it then casts into its own, private, struct type.

The obvious thing, when implementing this same pattern in Scheme, is therefore to create a list of records, each of which has a callback closure field and a callback-parameters field:

(define-record todo callback params)

(define foo 123)

(define *todos*
  (list
   (make-todo
    (lambda (args) (printf "Callback: ~a\n" args))
    foo)))

(set! *todos*
      (cons
       (make-todo
	(lambda (args) (printf "Callback 2\n"))
	'())
       *todos*))

(for-each
 (lambda (todo)
   ((todo-callback todo) (todo-params todo)))
 *todos*)

However, as soon as you realise why closures are called closures, you gasp in delight and simplify this by having the closure close over its arguments, rather than needing them clumsily bound to it by the record - meaning you don't even need the record any more, and can just keep a list of closures:

(define foo 123)

(define *todos*
  (list
   (lambda () (printf "Callback: ~a\n" foo))))

(set! *todos*
      (cons
       (lambda () (printf "Callback 2\n"))
       *todos*))

(for-each
 (lambda (todo)
   (todo))
 *todos*)

You pat yourself smartly on the head, as you've now removed the need to define any messy custom record types or anything. What can be more Schemely than a list of closures? But we can be a little smugger than that. Who needs lists when we have lambda? We can define the entire list of callbacks as a single closure, and when we add a new callback, just make our new callback call the previous:

(define foo 123)

(define *todos*
  (lambda () (printf "Callback: ~a\n" foo)))

(set! *todos*
      (let ((old-todos *todos*))
	(lambda ()
	  (printf "Callback 2\n")
	  (old-todos))))

(*todos*)

In this toy example, this gives us little (and even loses us the ability to count the number of callbacks pending in the list), but in larger systems, there is great benefit to the fact that we have simplified the interface: callers of the callback chain no longer need to care if it's a linked list of structures, or closures, or whatever, to know how to invoke it - it's just a closure, and they call it. Which is far harder to get wrong than iterating over some data structure. A closure you call is often the simplest possible interface to anything encapsulating any kind of behaviour!

A more advanced example is fold. When I first encountered fold, I saw it as a tool for abstracting operations over lists, like map, except one that I rarely had much use for, and therefore never remembered the argument order for, and therefore didn't use when I probably should have; it was easier to roll my own recursion each time.

However, the provision of fold-alike procedures as a way of iterating over the entire database in the gdbm egg made me look at fold from a different perspective. I realised that, once a fold procedure had been partially applied to the object it was folding over, it became a way of representing any iterable data structure as a closure. Put simply, anything that can be thought of as a sequence of objects - a list, a vector, a text file full of s-expressions, a random number source - could be represented as a closure taking two arguments; the KONS and KNIL arguments to fold, which could then process the elements of the sequence in turn and produce an answer, without any need to know where the elements are coming from.

There are problems with using this approach - it makes it hard to compute the sum of two sequences and other such sequence-combining operations without tiresome continuation knitting, for example. But it is convenient for a lot of situations, and it neatly demonstrates the way that data can be represented in closures, and how this abstracts the representation away.

Try and think about what data structures in your code could be replaced with simple closures.

Continuations can be rather mind-bending, not least of all because they are indistinguishable from any other closure at first glance. Beginners shold not try and work out what (call/cc call/cc) does! However, the way to master continuations is to start by using them in simple ways. I came across a good demonstration of a practical, easy to understand, use for continuations when writing the interactive extraction interface for Ugarit, which lets one explore a directory structure stored in a Ugarit archive. When listing directories, I wanted to provide the user with a pager that pauses after every twenty lines output, and lets the user abort further listing (returning to the command line) or ask for another page.

Doing this was simple with continuations, even though the actual directory listing came from the Ugarit backend through a fold interface that called my display function for each directory entry in turn. I implemented paging by making my display function increment a counter on every call, and on the twentieth, to display the prompt and wait for user input. That was easy, but as the fold had "control" and would call my callback mercilessly until it had run out of directory entries, how to implement aborting the listing?

The answer was simple with continuations. I used the convenient let/cc wrapper to call-with-current-continuation to capture the escape continuation of the original call to the fold function, and then inside my callback, I invoked that continuation if I wanted to abort early. Somewhat simplified, the code looks like this:

(use miscmacros) ;; Needed for let/cc

(let/cc escape
   (fold-archive-node archive directory-key
      (lambda (node-key dirent row)
	(printf "...display the row...\n")
	(if (> row 20)
	    (begin
	      (printf "-- Press q then enter to stop, or enter for more...\n")
	      (if (string=? (read-line) "q") ; Did they say "q"?
		  (escape (void)) ; If so, escape and return void!
		  0)) ; Otherwise, reset row counter to zero and continue
	    (+ row 1))) ; Increment counter and continue
      0)) ; Initial counter

If you want to try that yourself to see how it works, try prefixing it with the following dummy declarations to stub out the Ugarit dependency:

(define (fold-archive-node a d kons knil)
  (if (zero? a)
      knil
      (fold-archive-node (- a 1) d kons (kons a d knil))))
(define archive 50)
(define directory-key #f)

As you can see, continuations (particularly with the convenient let/cc syntax) can be used to quickly "escape" from within some complex job, like returning from a function in the middle of a for loop in other languages. Continuations can be used to build threading systems, coroutines, exception systems, backtracking, and the like - but leave that to the many fine eggs that do that for you, unless you actually want to write one yourself; the main use for continuations for beginners is to provide quick exits!

5. About the Chicken Gazette

The Gazette is produced weekly by a volunteer from the Chicken community. The latest issue can be found at http://gazette.call-cc.org or you can follow it in your feed reader at http://gazette.call-cc.org/feed.atom. If you'd like to write an issue, check out the instructions and come and find us in #chicken on Freenode!

The chicken image used in the logo is kindly provided and © 2010 by Manfred Wischner