Issue 4

by Peter Bex

2010-09-17 +0000

0. Introduction

Welcome to issue 4 of the Chicken Gazette!

This week's great news was of course the 4.6.0 release of Chicken!

1. Chicken 4.6.0

The new release includes many bugfixes, usability improvements and performance optimisations. Chicken now also runs on yet another platform: Haiku!

Here is a short rundown of things that require changes in your code, to ensure your upgrade goes as smoothly as possible:

There are quite a few procedures that have been deprecated. This means they will be removed entirely in the next release (4.7.0), so it's a good idea to already get rid of those in your code:

Two things that were deprecated in the previous release have now been removed:

Whew, that's a lot of changes! There are lots more, but you can read all about those in the NEWS file.

Users of Chicken on Cygwin should use the 4.6.1 development snapshot instead. There was a small problem in the bootstrapping files which was solved by compiling them using Chicken 4.6.0, but this didn't make it in time for the release. The snapshot contains no new experimental stuff, so it's safe to use.

2. The Hatching Farm - New Eggs & The Egg Repository

This week, Ivan Raikov has released a new egg called iexpr, which is an implementation of the indentation-sensitive alternative syntax for s-expressions described in SRFI-49. Now you can convince all your Pythonic friends and other parenthophobes to give Chicken a try!

Furthermore, Ivan has also converted more of his eggs' documentation from the old eggdoc format to the wiki. Well done, Ivan!

This week's Gazette editor, Peter Bex, has fixed a bug in uri-common which caused some servers to reject requests with parameters containing certain non-alphanumeric characters. He's also fixed a potential race condition in Spiffy that could cause it to stop accepting incoming requests after a while.

Last week's Gazette editor, Alaric Snell-Pym, has picked up work on his Ugarit backup library/program again, fixing a few bugs and working on a way to get rid of the big list of required eggs.

And last but not least, Mario Domenech Goulart has made a few bugfixes and enhancements in autoform-postgresql, awful, salmonella and spiffy-request-vars.

3. The Core - Bleeding Edge Development

A new branch called pointer-vectors has been created by Felix. This branch is intended to add a new data structure that describes a vector type much like u8vector but containing foreign C-pointers instead of bytes. This maps to the C type of void ** in the FFI.

Several people have graciously tested the make-refactoring branch and provided feedback. This resulted in three bugs being found that were also present in the original code: one linking problem on Cygwin, one problem with incorrect nursery size on Cygwin and one problem with u8vectors on PowerPC platforms. These bugs have all been fixed in experimental and merged into make-refactoring.

Thanks to everyone who has tested this branch! We could still use some test results from Solaris and some more testing on Windows and Haiku though, so please help out if you are running Chicken on one of those platforms! If you're unsure how to test or need help, please write to the mailing list.

4. Chicken Talk

It was relatively quiet on the mailing list this week, with only two threads, aside from the release announcement and Cygwin update:

The first was a thread about autoloading of optional egg dependencies, started by Alaric Snell-Pym. He also observed a slightly annoying behaviour in how Chicken handles autoloaded imports, which resulted in a bug report after Alex Shinn chimed in to suggest a solution to get rid of this behaviour.

The other was another thread about packaging eggs by Jim Pryor, who reported some problems and inconsistencies in a few eggs. These have now been fixed, which should make it easier for others who wish to make egg packages for their operating system.

In other news, Christian Kellermann has converted his weblog from the Perl-powered Blosxom to the Chicken-powered Hyde, the software that is also used to publish this Gazette. Perhaps he'll blog about how the conversion was done (hint, hint :P)

5. Omelette Recipes - Tips and Tricks

This week's featured egg is the iset egg ("iset" is short for "integer set"). It is a very useful egg for when you need to keep around arbitrarily large sets of integers. For example, in the 9p egg I used iset's bit-vectors to keep track of open file descriptor numbers that were passed to a 9p server. For another use case, see the utf8 egg, which uses isets to represent SRFI-14 character sets on the full Unicode range, storing the integer value of each character.

How does it work? First of all, this egg actually provides two APIs. You'll need to determine which of these you want to use: full integer sets or simple bit-vectors.

If your integer sets are either sparse (have big gaps in them) or dense (have large full ranges of integers in them), or you need to perform a lot of set algebra or want to use higher order functions like "map" and "fold" on them, you probably want to go with isets.
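
To make the sparse case concrete, here is a minimal sketch. It assumes that the iset egg's iset-contains? and iset-size procedures do what their names suggest; the numbers are made up purely for illustration.

```scheme
(use iset)

;; A sparse set: just two members, separated by a huge gap.
(define sparse (iset 5 1000000))

(iset-contains? sparse 5)         ; => #t
(iset-contains? sparse 1000000)   ; => #t
(iset-contains? sparse 6)         ; => #f, nothing lives in the gap
(iset-size sparse)                ; => 2, counts members rather than the span
```

A plain bit-vector holding the same two members would presumably have to span every index up to 1000000, which is where the efficiency difference comes from.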

If in doubt, go with isets. They are implemented on top of bit-vectors and add little overhead, whereas using bit-vectors where you should be using isets costs you dearly in terms of efficiency. We'll start by explaining bit-vectors, though.

You could regard bit-vectors as a sort of extension of the basic srfi-4 vector types. They have slightly different semantics, however. Where u8vectors have u8vector-ref which returns a number representing the numeric byte value at that position, bit-vectors have bit-vector-ref which returns #t if the bit is set and #f when it is not set. Also, very importantly, they provide a functional API, so no messing around with exclamation-mark procedures, unless you really want to!

Bit-vectors also come with a set of operations that u8vectors do not have, but which you would typically use on integers when using them for their bit-values:

  (use iset)
  (define v (make-bit-vector 3)) ; 3 is a hint how many bits we want to store
  
  (bit-vector-ref v 1) => #f
  (set! v (bit-vector-set v 1 #t))
  (bit-vector-ref v 1) => #t
  (set! v (bit-vector-shift v 2))  ; Shift left by 2
  (bit-vector-ref v 3) => #t
  (set! v (bit-vector-shift v -1)) ; Shift right by 1
  (bit-vector-ref v 2) => #t
  
  ;; Quickly initialize a bitvector with an integer's bitwise representation:
  (define v2 (integer->bit-vector #b0010))
  
  ;; Inclusive or:
  (define v3 (bit-vector-ior v v2))
  (bit-vector-ref v3 0) => #f
  (bit-vector-ref v3 1) => #t
  (bit-vector-ref v3 2) => #t
  
  ;; The vector's highest bit that's set is the 3rd bit (index 2):
  (bit-vector-length v3) => 3
  
  ;; Are all bits from 0 till 2 (the three lowest bits) set?
  (bit-vector-full? v3 3) => #f
  
  ;; No? Oh right, it has only two bits set:
  (bit-vector-count v3) => 2
  
  (set! v3 (bit-vector-set v3 0 #t))
  ;; *now* the three lowest bits (0 through 2) are all set
  (bit-vector-full? v3 3) => #t

As you can see, this is a useful way to work with collections of bits, and since it can handle sets with arbitrarily large bit indices, you can use this as an alternative to the bitwise operations from the numbers egg, which is a rather large dependency to pull in if you just want large sets of bits.
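
As a rough illustration of the bookkeeping use case mentioned at the start of this section (tracking which small numbers are currently in use), here is a minimal sketch. It is not how the 9p egg does it: lowest-free-index is a hypothetical helper, and the sketch assumes that bit-vector-set accepts #f to clear a bit and that bit-vector-length behaves as in the example above.

```scheme
(use iset)

;; Hypothetical helper: index of the lowest bit that is not set.
;; The length check keeps us from reading past the highest set bit.
(define (lowest-free-index bv)
  (let ((len (bit-vector-length bv)))
    (let loop ((i 0))
      (if (and (< i len) (bit-vector-ref bv i))
          (loop (+ i 1))
          i))))

(define fds (make-bit-vector 8))        ; nothing in use yet

(define n1 (lowest-free-index fds))     ; => 0
(set! fds (bit-vector-set fds n1 #t))   ; mark 0 as in use

(define n2 (lowest-free-index fds))     ; => 1
(set! fds (bit-vector-set fds n2 #t))   ; mark 1 as in use

(set! fds (bit-vector-set fds n1 #f))   ; release 0 again
(lowest-free-index fds)                 ; => 0, the released number is reused
```

Because the numbers handed out this way stay small and densely packed, a bit-vector is a reasonable fit here; for scattered values an iset would be the safer choice.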

The iset API is modeled after the SRFI-14 API, but provides many more operations:

  (use iset)
  (define i (iset 1 2 3))
  
  ;; Higher-order procedures:
  (iset-any (lambda (x) (and (even? x) x)) i) => 2
  (iset-every positive? i) => #t
  (iset-fold + 0 i) => 6
  
  (define (double x) (+ x x))
  (define i2 (iset-map double i))
  (iset->list i2) => (2 4 6)
  
  ;; Adding/removing integers:
  (iset->list (iset-adjoin i2 1 3)) => (1 2 3 4 6)
  (iset->list (iset-delete i2 2 6)) => (4)
  
  ;; Set algebra:
  (iset->list (iset-difference i i2)) => (1 3)
  (iset->list (iset-union i i2)) => (1 2 3 4 6)
  (iset->list (iset-intersection i i2)) => (2)
  (call-with-values (lambda () (iset-diff+intersection i i2))
                    (lambda (d i)
                      (list diff: (iset->list d)
                            intersection: (iset->list i))))
   => (diff: (1 3) intersection: (2))
  
  ;; For efficiency, you can also use exclamation-mark versions:
  (iset->list (iset-difference! i i2)) => (1 3)
  ;; But i *might* be modified now, so we cannot reuse it
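
The character-set use case mentioned earlier can be sketched in the same style. This is only an illustration of the idea, not the utf8 egg's actual code: string->char-iset and char-iset-contains? are hypothetical helpers, and the sketch assumes the egg's list->iset, iset-contains? and iset-size procedures.

```scheme
(use iset)

;; Hypothetical helper: build an iset from the characters in a string,
;; storing the integer value of each character.
(define (string->char-iset str)
  (list->iset (map char->integer (string->list str))))

;; Hypothetical helper: is character c a member of the set?
(define (char-iset-contains? cs c)
  (iset-contains? cs (char->integer c)))

(define vowels (string->char-iset "aeiou"))
(char-iset-contains? vowels #\e)   ; => #t
(char-iset-contains? vowels #\z)   ; => #f

;; Set algebra carries over unchanged:
(define digits (string->char-iset "0123456789"))
(define both (iset-union vowels digits))
(iset-size both)                   ; => 15
```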

I hope this gave you a good feel for how natural it is to represent integer sets with the iset egg, and what tradeoffs there are in choosing bit-vectors versus isets.

6. About the Chicken Gazette

The Gazette is produced weekly by a volunteer from the Chicken community. The latest issue can be found at http://gazette.call-cc.org or you can follow it in your feed reader at http://gazette.call-cc.org/feed.atom. If you'd like to write an issue, check out the instructions and come and find us in #chicken on Freenode!

The chicken image used in the logo is kindly provided and © 2010 by Manfred Wischner