A merry gang

Last Wednesday, FLIB’s kick-off meeting took place at Oblong’s Barcelona lab, with a dozen deliciously crazy people attending.

Due to lack of time and seriousness on my side, the main talk was given by Jos, who gave us an overview of his work with PLT Redex to model lambda calculus. Redex provides an embedded DSL to create context-sensitive term-rewriting systems, if you’ll pardon my buzzwording. In a hand-waving nutshell, term-rewriting systems are syntax-rules on steroids: one specifies a set of rules for transforming (rewriting, or reducing) terms to other terms according to their structure, possibly depending on context. Jos has a very nice example of such a system taken from GEB (Best. Book. Ever.), the MIU formal system, whose formal rules can be expressed in Redex as:

  (--> (‹symbol›  ... I) (‹symbol› ... I U))
  (--> (M ‹symbol› ..) (M ‹symbol› ... ‹symbol› ...))
  (--> (‹symbol›_0 ... I I I ‹symbol›_1 ...) (‹symbol›_0 ... U ‹symbol›_1 ...))

that is, if you find a trailing I, you can append U; if you find M, you can duplicate the rest of the string; and three consecutive Is can be reduced to a single U. Now, you can start with a given string (or axiom) and apply the rules to produce new ones (theorems). Note how the rules are contextual, and how there’s in general more than one that is applicable. Redex will do that for you, creating a tree with all possible reductions.

MIU reductions in Redex

Of course, there’s more to Redex than this simple example. For instance, it’s been used to provide an operational semantics for R6RS. Jos’ work is somewhere in the middle: while the reduction rules in lambda calculus are even simpler than in MIU, issues of scope quickly complicate things; moreover, Jos explores classical topics in lambda calculus, such as reduction to normal form, fixed point combinators or Church numerals to name a few, always using Redex (the staggering conceptual richness embodied by the humble premises of lambda calculus always amazes me). All in all, a beautiful 32-pages long paper, with accompanying code, that serves as a nice hands-on introduction to both lambda calculus and Redex, and which you can get at Jos homepage.

After the presentation, we devoted some time to talk about the future of FLIB. Monthly presentations, lightning talks on demand and a reading group. I like the latter a lot, because having a physical meeting among readers every month is an excellent way of keeping reading groups alive. We’re still deciding on our first book, but PLAI followed by LiSP seems to be gaining momentum right now. Another nice thing about the reading group is that it opens the possibility for people not able to come to the meetings to participate: just subscribe to our mailing list and join the discussions about the book du jour.

And then we just sat down around a table with some beer and snacks, and start talking about life and programming languages. I found it very stimulating because of the varied people’s backgrounds: we had guys from academia and industry; ones just starting their graduate courses, others with twenty years of teaching under their belts; people from several different countries; schemers obsessed with call/cc, smalltalkers, python experts, C++ loathers and programmers who secretly enjoy it, perlmongers and ruby or haskell aficionados. But, they all, people with a passion for programming: i think everybody was happy to have found a bunch of keen souls.

Perhaps it was inevitable that much of the discussion gravitated around our frustrations as programmers and teachers, given the sad state of computer science in both industry and academia, and the insurmountable barriers for adoption faced by the kind of languages we like. But, with that out of the way, here’s hope (as expressed by Andy after the meeting) that future meetings will concentrate on brighter fields.

A great evening, and no mistake. I hope you’ll join the fun next July 22nd!

flib

Update We’ve moved the date of our first meeting to June 17th, so you’re still in time to join us! If you want to follow our adventures, you can also ask for an invitation to our mailing list.

The other day, Andy and I met Jos, an experienced schemer who lives near Barcelona, with the idea of having lunch, talking about Scheme, and create a Scheme Users Group. After a bit of discussion, we agreed on widen the group’s scope, and start what we’re calling Fringe Languages In Barcelona (FLIB). The plan is to conduct periodic meetings with a main presentation followed by some lightning talks (the latter were a complete success at ILC, and we’d like to try and see how they work for us), with as much discussion interleaved as we see fit. We’ll have some refreshments available and, since we’re meeting in the very center of the old city, visits to pubs or a restaurant for dinner and further socializing are to be expected.

As i said, we’re expecting much discussion about Scheme and Lisp, but we’re not ruling out by any means other fine languages. For instance, the talk for the inaugural session (scheduled June 10th17th, 7:30 pm) is entitled The implementation of FUEL, Factor’s Ultimate Emacs Library, and it will include a short introduction to Factor (yes, i am the victim speaker). Jos will come next, the same day, with a lightning talk about PLT Redex. We have free slots for more lighting talks: you are invited not only to come, but to give one if you’re so inclined. This being our first meeting, there will be also some time for logistics and organisation.

So, if you’re near here by then, by all means, come in and join the fun:

Calle del Pi 3 Principal Interior (first floor)
Barcelona

Not really needed, but if you’re thinking about coming, sending me a mail beforehand will help us to be sure that we’ve got enough food and drinks.

We’re looking forward to getting FLIB started, and we’re sure that at least grix more fringers are coming! Don’t miss it!

Reinventing programming

Alan Kay hardly needs a presentation, so i won’t waste your time before pointing out to his latest interview, where he talks with Allan E. Alter about the current computing landscape. As you may expect from a visionary such as Kay, he is not exactly happy with what he sees, and is currently working in his Viewpoints Research Institute to try and invent the future of programming. Besides his involvement in the “One Laptop per Child” project, Kay and coworkers have recently been awarded a NFS grant to develop their ideas on how a better programming platform should be. If you’re curious (and who would not!), you can read some of the details of their amazing plans in the proposal they submitted to the NFS: Steps Towards the Reinvention of Programming. This proposal for the future starts by trying to recover the best from the past, particularly the seemingly forgotten ideas of another visionary, Doug Engelbart. As Kay rightly points out during the interview,

[Most of those ideas] were written down 40 years ago by Engelbart. But in the last few years I’ve been asking computer scientists and programmers whether they’ve ever typed E-N-G-E-L-B-A-R-T into Google-and none of them have. I don’t think you could find a physicist who has not gone back and tried to find out what Newton actually did. It’s unimaginable. Yet the computing profession acts as if there isn’t anything to learn from the past, so most people haven’t gone back and referenced what Engelbart thought.

The reinventing programming project tries to change this situation with some interesting proposals. Their envisioned system would put forward the lessons drawn from Squeak and Etoys towards the creation of a fully introspective environment which can be understood completely by its users; actually, a system which guides programmers to full disclosure of its innards. In Kay and coworkers’ words:

This anticipates one of the 21st century destinies for personal computing: a real computer literacy that is analogous to the reading and writing fluencies of print literacy, where all users will be able to understand and make ideas from dynamic computer representations. This will require a new approach to programming. [...] This will eventually require this system to go beyond being reflective to being introspective via a self-ontology. This can be done gradually without interfering with the rest of the implementation.

So, simplicity is key, and they purport to write such a system in a mere 20K LOC. To that end, they propose a sort of great unification theory of particles (homogeneous, extensible objects) and fields (the messages exchanged by myriad objects)—well, yes, it’s just a metaphor, but you can see it in action in the paper, applied to images and animations. The report also explains how the physical metaphor is completed with a proper simulation of the concept of time. As for introspection, inspiration comes, quite naturally, from Lisp:

What was wonderful about this [John McCarthy's] approach is that it was incredibly powerful and wide-ranging, yet was tiny, and only had one or two points of failure which would cause all of it to “fail fast” if the reasoning was faulty. Or, if the reasoning was OK, then the result would be a very quick whole system of great expressive power. (John’s reasoning was OK.) In the early 70s two of us used this approach to get the first version of Smalltalk going in just a few weeks: one of us did what John did, but with objects, and the other did what Steve Russell did. The result was a new powerful wide-ranging programming language and system seemingly by magic.

Albert bootstrappingLest anyone thinks that all of this is just a loosely knitted bag of metaphors and wishful thinking, the report gives some technical detail on an actual implementation of some of these ideas. Albert is a bootstrapper that is able in a few hundreds of lines of code to make a kernel for Squeak that runs nine times faster than existing interpreters. The bootstrap process looks fascinating:

A disposable compiler (written in C++) implements a simple message-passing object-oriented language in which a specification-based Object compiler (implementing the same language) is implemented. The system is now self-implementing but still static. A dynamic expression compiler/evaluator is then implemented using the static compiler and used to replace the static messaging mechanisms with dynamic equivalents. The system is now self-describing and dynamic ­ hence pervasively late-bound: its entire implementation is visible to, and dynamically modifiable by, the end user.

Again, the proposal gives a bit more detail, but i’m not sure i’m understanding it fully: if anyone knows if/where Albert’s code is available, please chime in!

Not that i agree 100% with all the ideas in the report (and, as i said, there’re quite a few i don’t fully grasp), and i’m sure most of you won’t agree with everything either. But it’s definitely worth the effort reading, trying to understand and mulling over Alan Kay’s vision of the future of programming. He knows a bit about these things.

Update: Thanks to Glenn Ehrlich, who in a comment below provides links to learn more about Ian Piumarta’s Albert, also known as Cola/Coke/Pepsi.

Smalltalk daily

VisualworksI’m sure this is old hat for all Smalltalk practitioners, or anybody with Planet Smalltalk in her feed list, but here’s a great way of learning a bit more about one of the most elegant programming languages ever:Smalltalk daily. James Robertson, Cincom Smalltalk‘s Product Manager, is one of the few guys around with the word manager in their job title that knows how to program; and he’s sharing his knowledge on a daily basis in the screencast series linked above. He started last month, and has managed to live up his self-imposed, tight scheduled of one post per day. Naturally enough, he uses Visual Works for his demos, but many of the videos are about generic Smalltalk features, available in other implementationa.For instance, see his explanation on Smalltalk variables and inspectors, or this review on classes and inheritance. (As an aside, there’s also an interesting podcast series on Smalltalk conducted by James… i wonder where he finds the time!)

DolphinSmalltalk environments are one of the most amenable to video demonstrations: the whole object graph of your running environment is there for you to explore in real time, and there’s lots of interesting ways to interact with it [0]. As a matter of fact, everybody seems to want to show you his favorite Smalltalk: see for instance Stefan Ducasse’s collection of Squeak videos or this nice flash demo of how to do TDD in Dolphin Smalltalk (one of the few Windows programs i wish i had in Mac or Linux).

I was trying to imagine an equivalent video on C or Java development and couldn’t stop yawning ;-). Enjoy!


[0] See this blog entry by Cees de Grot for an enthralling account of Smalltalk’s lifeliness.

Moo-oriented programming

I rarely play computer games, probably because i find programming so fun (or maybe just because i’m a dull boy). Therefore, this recent Wired News article was a total surprise to me. It reviews playsh, a pretty interesting collaborative programming environment.

If you’re not as dull as i am, you’ll already know about MUDs and MOOs, the popular distributed role-playing platforms. Playsh substitutes the grues and spells of your typical MOO by whatever program code you’re working on. Actually, not only you, but any other coder connected to your server. The idea of wondering around rooms where you find your programs objects and APIs and pick and modify them possibly in collaboration with other programmers in the room is quite amusing. We are just taking the living environment of dynamic languages one step further, making it collaborative.

Playsh is Matt Webb‘s (of Mind Hacks fame) brainchild, and he has already released a Python-based proof-of-concept implementation developed in collaboration with Ben Cerveny. Documentation is still scarce, but this blog entry of Matt’s gives a good overview of his goals and plans for the future, and there’s also some info around on Matt and Ben’s recent Etech session on playsh. Also worth reading are Matt’s ideas on the history of physics and the future of computing and his essay on modernity and protest, which provide the intellectual background that has led Matt to playsh.

Playsh depens on quite a few external modules and installing it is currently a bit of a chore, but if you get it running you’ll have a text-based interface that allows coding network objects (accessed via standard web protocols like RSS or HTTP) as you wander around its MUD-like rooms together with any other player, er, programmer. Each object you encounter has a set of shared properties (akin to instance variables) and verbs (methods on them), the latter being user specific. To modify and add verbs or objects, one drops from the navigation environment into and interactive Python interpreter. The nifty thing is that not only the objects, but the interpreter itself is shared among programmers: you can see your colleagues typing their code and have your say on it!

Interesting as they are, these ideas are by no means completely new, as any Croquet user will tell you. But still.

Technorati Tags: , , , ,

Programmers go bananas

Introduction: lists galore

I learned programming backwards, plunging right on into C and, shortly after, C++ and Java from the very beginning. I was knee deep in complex data structures, pointers and abstruse template syntax in no time. And the more complex it all felt, the more i thought i was learning. Of course, i was clueless.

Reading SICP and learning about functional programming changed it all. There were many occasions for revelation and awe, but one of my most vivid recollections of that time is my gradual discovery of the power of simplicity. At about half way into SICP i realised in wonder that that beautiful and powerful world was apparently being constructed out of extremely simple pieces. Basically, everything was a list. Of course there were other important ingredients, like procedures as first-class objects, but the humble list was about the only data structure to be seen. After mulling on it for a little bit, i saw where lists draw their power from: recursion. As you know, lists are data types recursively defined: a list is either the empty list or an element (its head) followed by another list (its tail):

list = []
list = a : list

where i’m borrowing Haskell’s notation for the empty list ([]) and the list constructor (:), also known by lispers as () and cons. So that was the trick, i thought: lists have recursion built-in, so to speak, and once you’ve read a little bit about functional programming you don’t need to be sold on the power and beauty of recursive programs.

It is often the case that powerful and beautiful yet simple constructs have a solid mathematical foundation, and only when you grasp it do you really realize how powerful, beautiful and amazingly simple that innocent-looking construct is. Lists, and recursive operations on them, are an excellent case in point. But the path connecting them to their mathematical underpinnings is a long and winding one, which lays in the realm of Category Theory.

I first became acquainted of the relationship between categories and recursive programming reading Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire, by Erik Meijer, Maarten Fokkinga and Ross Paterson. Albeit very enjoyable, this paper presupposes a high degree of mathematical sophistication on the reader’s side. I will try in this article to give you a simplified overview of the concepts involved, including Category Theory, its application to programming languages and what funny names like catamorphism, anamorphism or lambda-lifting have to do with your everyday list manipulations. Of course, i’ll be only scratching the surface: interspersed links and the Further reading section provide pointers to more in-depth explorations of this wonderland.

Categorical interlude

CategoriesCategories are (relatively) simple constructs. A category consists of a set O of objects, and a set A of arrows between elements of O. Arrows are composable: if there’s an arrow from a to b, and another one from b to c, there must be an arrow from a to c (where a, b and c are elements of O). Besides, they are associative: if you have arrows from a to b, b to c, and c to d, you can go from a to d via two different paths, namely, first from a to c and then from c to d, or first from a to b and then from b to d. Finally, for each element a in O there’s an identity arrow which goes from a to itself (called an identity), such that following this arrow changes nothing. These properties are better visualized with a diagram (or a bit of mathematical notation), as shown in the image on the right.

A category captures a mathematical world of objects and their relationships. The canonical example of a category is Set, which contains, as objects, (finite) sets and, as arrows, (total) functions between them. But categories go far beyond modeling sets. For instance, one can define a category whose objects are natural numbers, and the ‘arrows’ are provided by the relation “less or equal” (that is, we say that there is an arrow joining two numbers a and b if a is less or equal than b). What we are trying to do with such a definition is to somehow capture the essence of ordered sets: not only integers are ordered but also dates, lemmings on a row, a rock’s trajectory or the types of the Smalltalk class hierarchy. In order to abstract what all those categories have in common we need a way to go from one category to another preserving the shared structure in the process. We need what the mathematicians call an isomorphism, which is the technically precise manner of stating that two systems are, in a deep sense, analogous; this searching for commonality amounts to looking for concepts or abstractions, which is what mathematics and (good) programming is all about (and, arguably, intelligence itself, if you are to believe, for instance, Douglas Hofstadter‘s ideas).

To boot, our definition of a category already contains the concept of isomorphic objects. Think of an arrow from a to b as an operation that transforms a in b. An arrow from b to a will make the inverse transformation. If composing both transformations gives you the identity, you are back to the very same object a, and we say that a and b are isomorphic: you can transform one into the other and back at will. In a deep sense, this concept captures a generic way of expressing equality that pervades all maths: if you’re not afraid of a little bit of maths, Barry Mazur‘s essay When is a thing equal to some other thing? is an excellent introduction to Category Theory with an emphasis in the concept of equality. Among many other things, you will learn how the familiar natural numbers can be understood as a category, or how an object is completely defined by the set of its transformations (and, therefore, how to actually get rid of objects and talk only of transformations; i know this is stretching and mixing metaphors (if not plain silly), but this stress in arrows, as opposed to objects, reminded me of Alan Kay’s insistence on focusing on messages rather than objects). Another introductory article with emphasis on categories as a means to capture sameness is R. Brown and T. Porter’s Category Theory: an abstract setting for analogy and comparison.

Not only objects inside a category can be transformed into each other. We reveal the common structure of two disjoint categories by means of a functor mapping across two categories. A functor consists of two functions: one that maps each object of the first category to an object in the second, and another one putting in correspondence arrows in one category with arrows in the second. Besides, these functions must preserve arrow composition. Let me spell this mathematically. Consider to categories, C and C’ with object sets O and O’ and arrow sets A and A’. A functor F mapping C to C’ will consist then of two functions (Fo, Fa); the first one taking elements of O to elements of O’:

Fo: O -> O’

Fo(a) in O’ for every a in O

and the second one taking arrows from A to arrows in A’:

Fa: A -> A’

Fa(f) in A’ for every f in A

and such that, if f is an arrow from a to b in C, Fa(f) is an arrow from Fo(a) to Fo(b) in C’. Moreover, we want that following arrows in C is ‘analogous’ to following them in C’, i.e., we demand that

Fa(fg) = Fa(f)Fa(g)

In the left hand side above, we are composing two arrows in C and then going to C’, while in the right hand side we first take each arrow to C’ and, afterwards, compose them in there. If C and C’ have the same structure, these two operations must be equivalent. Finally, F must preserve identities: if i is the identity arrow for an element a in O, Fa(i)must be the identity arrow for Fo(a) in O’. The diagram on the left shows a partial graph (i’m not drawing the identity arrows and their mappings) of a simple functor between two categories, and ways of going from an object a in the first category to an object x in the second one which are equivalent thanks to the functor’s properties.

As you can see in this simple example, the functor gives us the ability of seeing the first category as a part of the second one. You get a category isomorphism in the same way as between objects, i.e., by demanding the existence of a second functor from C’ to C (you can convince yourself that such a functor does not exist in our example, and, therefore, that the two categories in the diagram are not isomorphic).

You have probably guessed by now one nifty property of functors: they let us going meta and define a category whose objects are categories and whose arrows are functors. Actually, Eilenberg and MacLane‘s seminal paper General theory of natural transformations used functors and categories of categories to introduce for the first time categories (natural transformations are structure-preserving maps between functors: this Wikipedia article gives an excellent overview on them).

But enough maths for now: it is high time to show you how this rather abstract concepts find their place in our main interest, programming languages.

Categories and programming languages

About the only similarity between C and Haskell programming is that one spends a lot of time typing ASCII arrows. But of course, Haskell’s are much more interesting: you use them to declare the type of a function, as in

floor:: Real -> Int

The above stanza declares a function that takes an argument of type real and returns an integer. In general, a function taking a single argument is declared in Haskell following the pattern

fun:: a -> b

where a and b are types. Does this ring a bell? Sure it does: if we identify Haskell’s arrows with categorical ones, the language types could be the objects of a category. As we have seen, we need identities

id:: a -> a
id x = x

and arrow composition, which in Haskell is denoted by a dot

f:: b -> c
g:: a -> b
fg:: a -> b -> c
fg = f . g

Besides, associativity of arrow composition is ensured by Haskell’s referential transparency (no side-effects: if you preserve referential transparency by writing side-effect free functions, it won’t matter the order in which you call them): we’ve got our category. Of course, you don’t need Haskell, or a statically typed language for that matter: any strongly typed programming language can be modelled as a category, using as objects its types and as arrows its functions of arity one. It just happens that Haskell’s syntax is particularly convenient, but one can define function composition easily in any decent language; for instance in Scheme one would have

(define (compose f g) (lambda (x) (f (g x)))

Functions with more than one arguments can be taken into the picture by means of currying: instead of writing a function of, say, 2 arguments:

(define (add x y) (+ x y))
(add 3 4)

you define a function which takes one argument (x) and returns a function which, in turn, takes one argument (y) and returns the final result:

(define (add x) (lambda (y) (+ x y)))
((add 3) 4)

Again, Haskell offers a pretty convenient syntax. In Haskell, you can define add as:

add x y = x + y

which gets assigned, when applied to integers, the following type:

add:: Int -> (Int -> Int)

that is, add is not a function from pairs of integers to integers, but a function that takes an integer and returns a function of type Int -> Int. Finally, we can also deal with functions taking no arguments and constant values by introducing a special type, called unit or 1 (or void in C-ish), which has a unique value (spelled () in Haskell). Constants of our language (as, e.g., True or 43.23) are then represented by arrows from 1 to the constant’s type; for instance, True is an 1 -> Boolean arrow. The unit type is an example of what in category theory is known as a terminal object.

Now that we have successfully modelled our (functional) programming language as a category (call it C), we can use the tools of the theory to explore and reason about the language constructs and properties. For instance, functors will let me recover the original motivation of this post and explore lists and functions on them from the point of view of category theory. If our language provides the ability to create lists, its category will contain objects (types) of the ‘list of’ kind; e.g. [Int] for lists of integers, [Boolean] for lists of Booleans and so on. In fact, we can construct a new sub-category CL by considering list types as its objects and functions taking and returning lists as its arrows. For each type a we have a way of constructing a type, [a] in the sub-category, i.e., we have a map from objects in C to objects in CL. That’s already half a functor: to complete it we need a map from functions in C to functions in CL. In other words, we need a way to transform a function acting on values of a given type to a function acting on lists of values of the same type. Using the notation of the previous section:

Fo(a) = [a]
Fa(f: a -> b) = f': [a] -> [b]

Fa is better known as map in most programming languages. We call the process of going from f to f' lifting (not to be confused with a related, but not identical, process known as lambda lifting), and it’s usually pretty easy to write an operator that lifts a function to a new one in CL: for instance in Scheme we would write:

(define (lift f) (lambda (lst) (map f lst)))

and for lift to truly define a functor we need that it behaves well respect to function composition:

(lift (compose f g)) = (compose (lift f) (lift g))

We can convince ourselves that this property actually holds by means of a simple example. Consider the function next which takes an integer to its successor; its lifting (lift next) will map a list of integers to a list of their successors. We can also define prev and (lift prev) mapping (lists of) integers to (lists of) their predecessors. (compose next prev) is just the identity, and, therefore, (lift (compose next prev)) is the identity too (with lifted signature). But we obtain the same function if we compose (lift next) and (lift prev) in CL, right? As before, there’s nothing specific to Scheme in this discussion. Haskell even has a Functor type class capturing these ideas. The class defines a generic lift operation, called fmap that, actually, generalizes our list lifting to arbitrary type constructors:

fmap :: (a -> b) -> (f a -> f b)

where f a is the new type constructed from a. In our previous discussion, f a = [a], but if your language gives you a way of constructing, say, tuples, you can lift functions on given types to functions on tuples of those types, and repeat the process with any other type constructor at your disposal. The only condition to name it a functor, is that identities are mapped to identities and composition is preserved:

fmap id = id
fmap (p . q) = (fmap p) . (fmap q)

I won’t cover usage of type constructors (and their associated functors) other than lists, but just mention a couple of them: monads, another paradigmatic one beautifully (and categorically) discussed by Stefan Klinger in his Programmer’s Guide to the IO Monad – Don’t Panic (also discussed at LtU), and the creation of a dance and music library, for those of you looking for practical applications.

To be continued…

Returning to lists, what the lifting and categorical description above buys us is a way to formalize our intuitions about list operations, and to transfer procedures and patterns on simple types to lists. In SICP’s section on Sequences as conventional interfaces, you will find a hands-on, non-mathematical dissection of the basic building blocks into which any list operation can be decomposed: enumerations, accumulators, filters and maps. Our next step, following the bananas article i mentioned at the beginning, will be to use the language of category theory to provide a similar decomposition, but this time we will talk about catamorphisms, anamorphisms and similarly funny named mathematical beasts. What the new approach will buy us is the ability to generalize our findings beyond the list domain and onto the so-called algebraic types. But this will be the theme of a forthcoming post. Stay tunned.

Further reading

The best introductory text on Category Theory i’ve read is Conceptual Mathematics : A First Introduction to Categories by F. William Lawvere (one of the fathers of Category Theory) and Stephen Hoel Schanuel. It assumes no sophisticated mathematical background, yet it covers lots of ground. If you feel at home with maths, the best option is to learn from the horse’s mouth and get a copy of Categories for the Working Mathematician, by Mac Lane.

The books above do not deal with applications to Computer Science, though. For that, the canonical reference is Benjamin Pierce’s Basic Category Theory for Computer Scientists, but i find it too short and boring: a far better choice is, in my opinion, Barr and Well’s Category Theory and Computer Science. A reduced version of Barr and Well’s book is available online in the site for their course Introduction to Category Theory. They are also the authors of the freely available Toposes, Triples and Theories, which will teach you everything about monads, and then more. Marteen Fokkinga is the author of this 80-pages Gentle Introduction to Category Theory, with a stress on the calculational and algorithmical aspects of the theory. Unless you have a good mathematical background, you should probably take gentle with a bit of salt.

Let me close by mentioning a couple of fun applications of Category Theory, for those of you that know a bit about it. Haskell programmers will like this implementation of fold and unfold (as a literate program) using F-(co)algebras and applied to automata creation, while those of you with a soft spot for Physics may be interested in John Baez’s musings on Quantum Mechanics and Categories.

Tags: , , , ,

The clones strike back

Back at Ozone‘s you can find a crash introduction to Io, a relatively new prototype-based language. As explained in its official site:

Io is a small, prototype-based programming language. The ideas in Io are mostly inspired by Smalltalk (all values are objects), Self (prototype-based), NewtonScript (differential inheritance), Act1 (actors and futures for concurrency), LISP (code is a runtime inspectable/modifiable tree) and Lua small, embeddable).

Besides running in all the usual platforms, and some not so usual ones like Symbian and Syllable, it offers an interesting set of libraries including sockets, databases, OpenGL, some crypto APIs or, notably, and Objective-C bridge.

Io’s syntax, if anything, is smalltalkish, and claims to be as simple as it takes: it has no keywords! It also features decent performance, sometimes not much worse that Python’s or Ruby’s.

If you’re interested in prototype-based languages, and want to try something simpler than Slate or newer than Self, Io looks like an option worth considering. Besides, applications like this one seem to point to a relatively mature language.

On a loosely related note, schemers interested in prototypes (like myself) may find Jorgen Schäfer’s Prometheus an excellent way to get acquainted with this fascinating subject, and maybe spend a couple of fun evenings implementing selfish patterns (as explained in the article by Brian Foote that gives name to this post) in Scheme.

Happy cloning!

Technorati Tags: , , ,

Dynamic languages day

The presentations and videos of the Dynamic Languages Day held last month at Vrije Universiteit are available at the event’s web site, which also counted with the collaboration of the Belgian Association for Dynamic Languages, whose home page starts with a call to arms: No Java, C++, Pascal or C here!

Prof. Viviane Jonckers gives an overview on the university’s use of Scheme as the lingua franca for most computer science courses in the first two years, stressing how beneficial to the students is an early exposure to the dynamic paradigm (PDF / Quicktime). A good quick overview of the features that make Scheme different.

No dynamic languages discussion could oversee Smalltalk. Johan Brichau and Roel Wuyts talked about the language’s features (PDF), including its metaobject protocol and flexible development environment, which so nicely puts to good use the dynamic nature of Smalltalk (as demonstrated in the video).

Ellen Van Pasesschen
reviewed the history and main traits of prototype-oriented languages before entering in a tutorial on Self’s strengths and efficiency issues (you’ll be surprised). Her presentation is real fun, and worth to see live. Ellen uses Self extensively in her Ph.D. dissertation, which includes a sophisticated interactive environment (called SelfSync) that creates object networks from Entity-Relationship diagrams, and keeps two-way synchronisation between objects and their ER model: just take a look at this demo to see what i mean.

Finally, Pascal Costanza gave a life demonstration of Common Lisp’s Object System and its metaobject protocol, and how it can be used to implement Aspect-Oriented-like features into Lisp. The talk is targetted at non-specialists, and is full of side-notes explaining CL idiosincrasies and questions from the audience, so that you can learn some Lisp on the way even if it’s not your language. The live part of the talk includes implementing a Python-like object system using the MOP, so be sure to watch the video in addition to downloading  the PDF slides and source code. If you know a little bit about CLOS, or get impressed enough with Pascal’s wizardry, don’t miss Closer, a set of libraries written by Pascal that take CL implementations closer to the MOP specification, and applies it to AOP and what Pascal calls Context-oriented Programming, or ContextL. Since i have yet to really understand what exactly are ContextL’s goals, i’d better just point you to the overview paper i’m currently reading.

Wow, i wish i was there!

Technorati Tags: , , , , , , , ,

Beyond mainstream object-oriented programming

Introduction

After a few scheming years, i had come to view objects as little more than poor-man closures. Rolling a simple (or not so simple) object system in scheme is almost a textbook exercise. Once you’ve got statically scoped, first-order procedures, you don’t need no built-in objects. That said, it is not that object-oriented programming is not useful; at least in my case, i find myself often implementing applications in terms of a collection of procedures acting on requisite data structures. But, if we restrict ourselves to single-dispatch object oriented languages, i saw little reason to use any of them instead of my beloved Scheme.

Things started to change recently due to my discovering the pleasures of Smalltalk. First and foremost, it offers a truly empowering integrated ambient to live and develop in. Second, if you’re going to use objects, using the simplest, cleanest syntax will not hurt. Add to that some reading on the beautiful design principles underlying Smalltalk, and one begins to wonder if closures aren’t, in fact, poor-man objects–or at least i do, whenever i fall in an object-oriented mood (i guess i’m yet not ready to reach satori).

But Scheme is not precisely an ugly or bad designed language, so i needed some other reason to switch language gears for my OO programming. I knew there’s more than encapsulation or subtype polymorphism in object-land from my readings on CLOS (the Common Lisp Object System), or on Haskell’s type classes (and its built-in parametric polymorphism), but i was after something retaining Smalltalk’s elegance. And then i remembered that, when i was a regular lurker in the Tunes project‘s mailing lists and IRC channel, a couple of smart guys were implementing an OO language whose syntax was smalltalkish. That language (which, if memory servers, started life with the fun name who me?) has evolved during the last few years into a quite usable programming environment named Slate, started by Lee Salzman and currently developed and maintained by Brian Rice.

I’ve been reading about Slate during the last few days, and decided to learn it. What motivated me was discovering how Slate goes beyond mainstream object-oriented programming by incorporating well-known (but hardly used) and really powerful paradigms. In short, Slate improves Smalltalk’s single-dispatch model by introducing and combining two apparently incompatible technologies: multiple dispatch and prototype-based programming. To understand the whys and hows of Slate, there’s hardly a better way than reading Lee Salzman’s Prototypes with Multiple Dispatch. The following discussion is, basically, an elaboration of Lee’s explanation on the limitations of mainstream OO languages, and how to avoid them with the aid of PMD.

(Note: click on the diagrams to enlarge them, or, if you prefer, grab a PDF of the whole set.)

Fishes and sharks

Click to enlargeLet’s start by showing why on earth would you need anything beyond Smalltalk’s object system (or any of its modern copycats). Consider a simple oceanographic ecosystem analyser, which deals with (aquatic) Animals, Fishes and Sharks. These are excellent candidates for class definitions, related by inheritance. Moreover, we are after modeling those beasts’ behaviours and, in particular, their reactions when they encounter each other: each time a Shark meets a Fish of other species, the Shark will swallow the other Fish, while when a Shark meets Shark they will fight. As a result of such fighting, Sharks get unhealthy, which regrettably complicates matters: wound sharks won’t try to eat other fishes, and will swim away other sharks instead of fighting them. The image on the left provides a sketchy representation of the code we need to model our zoo. Waters are quickly getting muddled implementation-wise.

On the one hand, subtype polymorphism based just on the object receiving the encounter message: we need, in addition, to take into account the argument’s concrete type to implement the desired behaviour. This is a well-known issue in single-dispatch languages, whose cure is, of course, going to multiple dispatching (see below). In particular, we want to avoid the need to modify existing classes whenever our hierarchy is extended.

On the second hand, varying state (exemplified here by the Shark’s isHealthy instance variable complicates the implementation logic. As we will see, prototype-based languages offer a way to factor out this additional complexity.

Beyond single-dispatch

The need to adjust behaviour on the basis of the type of both a message receiver and its arguments arises frequently in practice. So frequently, that a standard way of dealing with it has been christened as the Visitor design pattern. The technique, also known as double-dispatch, is well known: you can see, for instance, how it’s applied to arithmetic expressions in Smalltalk, or read about a generic implementation of multimethods in Python (which also includes a basically language-independent discussion on the issues at hand). If you happen to be a C++ programmer, you may be tempted to think that global functions and overloading solve the problem in that language. Well, think twice: a proper implementation of multiple dispatch in C++ needs of RTTI and templates, as shown in this article.

Click to enlargeCLOS and Dylan are two examples of languages solving the issue from the onset by including support for multi-methods. The idea is to separate methods from classes (which only contain data slots). As shown in the pseudo-code of the accompanying figure, methods are defined as independent functions with the same name, but differing in their arguments’ types (in CLOS, a set of such methods is called a generic function). When a generic function is called, the system selects the actual method to be invoked using the types of all the arguments used in the invocation. The encounter generic function in our running example provides a typical example, as shown in the figure on the right. The benefits of having multi-methods at our disposal are apparent: the code is simpler and, notably, adding new behaviours and classes to the system does not need modifications of existing code. For instance, we can introduce a Piranha, which eats unhealthy sharks instead of swimming away from them, by defining the requisite class and methods, without any modification whatsoever to the already defined ones.

On the downside, we have still to deal with the complications associated with internal state. Enter the magic world of prototype-based systems.

The ultimate dynamic

If you like dynamic languages, chances are you’ll find prototype-based system an almost perfect development environment. Prototype-based languages emerged as an evolution of Smalltalk with the invention of Self by David Ungar and Randall B. Smith during the late eighties. The key idea behind Self is noticing that, most of the time, class definitions needlessly coerce and complicate your object model.

A class definition becomes a contract to be satisfied by any instance, and it is all too easy to miss future or particular needs of your objects (class-based inheritance is just a partial solution to this problem, as shown, for instance, by the so-called fragile base class problem). But, if you look around you, objects change in internal behaviour and data content continously, and our attempts at distilling their Platonic nature are often in vain.

In prototype-based programming, instead of providing a plan for constructing objects, you simply clone existing instances and modify their behaviour by directly changing the new instance’s slots (which provide uniform access to methods and state). New clones contain a pointer to their parent, from which they inherit non-modified slots: there is no way to access state other than via messages sent to instances, which simplifies tackling with state.

Class-based languages oblige you to keep two relationships in mind to characterize object instances: the “is-a” relationship of the object with its class, and the “kind-of” relationship of that class with its parent. In self, inheritance (or behaviour delegation) is the only one needed. As you can see, Self is all about making working with objects as simple as possible. No wonder Ungar and Smith’s seminal paper was titled Self: The Power of Simplicity. Needless to say, a must read.

Click to enlargeThe figure on the left shows how our running example would look in selfish pseudo-code. As promised, state is no longer surfacing in our method implementation’s logic. Unfortunately, we have lost the benefits of multi-methods in the process. But fear not, for, as we will see, you can eat your cake and have it too. Instead of pseudo-code, you can use Self itself, provided you are the happy owner of a Mac or a Sun workstation. Or you can spend 20 fun minutes seeing the Self video, which features the graphical environment accompanying the system. Like Smalltalk, Self provides you with a computing environment where objects are created, by cloning, and interact with you. The system is as organic and incremental as one can possibly get.

Of course, you’re not limited to Self. For instance, Ken Dickey fleshed up Norman Adams’ saying that objects are poor man closure’s by offering a prototype-based object system in Scheme, and, more recently, Neil Van Dyke has released Protobj. And you have probably already used a very popular language in the family: Javascript. The list goes on, albeit, unfortunately, many of these languages lack either Self’s nice integrated environment, or a portable, up-to-date implementation. Slate to the rescue.

The best of both worlds

Prototyping and multiple dispatch are, at first sight, at odds. After all, method dispatching based on arguments’ type needs, well, a type for each argument, doesn’t it? As it happens, Lee Salzman and Brian Rice have envisioned a way of combining the power of both paradigms into Slate. In fact, proving how this is possible is the crux of Lee’s article. In addition, Slate aims at providing a complete development environment in the vein of Smalltalk or Self. Too good to be true? In future installments of this blog category, we’ll see how and why it’s true, but, if you cannot wait, just run-not-walk to Slate’s site. You’ll have a great time.

Tags: , , , , , ,

Dan Ingalls videos on Object-oriented programming

Dan Ingalls is one of the fathers of Smalltalk. He worked at Xerox PARC with Alan Kay when the latter invented the language and the OO paradigm during the seventies. Dan was the author of virtually all Smalltalk implementations up to Smalltalk-80, including the first one, using, of all languages, BASIC. As Alan explains in his Early History of Smalltalk, the first proof of concept ST was the result of a bet:

One day, in a typical PARC hallway bullsession, Ted Kaeher, Dan Ingalls, and I were standing around talking about programming languages. The subject of power came up and the two of them wondered how large a language one would have to make to get great power. With as much panache as I could muster, I asserted that you could define the “most powerful language in the world” in “a page of code.” They said, “Put up or shut up.” [...] For the next two weeks I got to PARC every morning at four o’clock and worked on the problem until eight [...]

After Alan showed up his newly designed language, he was about to turn back to real work, but:

Much to my surprise, only a few days later, Dan Ingalls showed me the scheme working on the NOVA. He had coded it up (in BASIC!), added a lot of details, such as a token scanner, a list maker, etc., and there it was–running. As he liked to say: “You just do it and it’s done.”

Over the next ten years, Dan made (with some help) at least 80 major releases of various flavors of Smalltalk. (in the meantime, he found time for many other things; for instance, inventing BitBLT.)

Picture 1By 1989, Dan Ingalls was working at Apple (applying Smalltalk to things as nifty as Fabrik), and he recorded this video on Object-Oriented Programming. During about an hour, the basic ideas of OO are reviewed, including some clues on how one implements a single-inheritance classes. Like all great ideas, it is surprisingly simple. The guys at PARC did not stop implementing the language, but went on to developing a complete operating system and environment based on it: Ingalls’ take on OO stresses those aspects, explaining how they were trying to tame complexity and how, for instance, garbage collection (which is really orthogonal to OO) was needed. The video contains a coda consisting in a little question and answer session, where typical objections to OO are reviewed. While not earth-saking, it makes for a funny view, and a review of the basics.

Picture 2 Fast forward to last October 25, when Dan gave a talk entitled Seven (give or take) Smalltalk implementations at Standford. He’s not the only who has gained a fresh new look: his Smalltalk is now Squeak (Dan has been deeply involved in its design and implementation), which he uses to give the talk and demonstrate the joys of an integrated programming ambient. Besides, Dan reviews the motivations and influences leading to Smalltalk, in a guided tour through its many incarnations over the years. There are lots of reminiscences of the Xerox days and his interactions with Alan Kay (including the bet i mentioned above).

Addendum

Early OOTalking about Smalltalk’s early influences, they include, of course Lisp and Simula. But not only them. A delicious piece of trivia that Alan explains in his Early History is his first encounter with an object-oriented design. As it happens, the precursor of OO was an anonymous programmer working for the Air Force around 1960. This unsung hero solved the problem of transporting data files by …

… taking each file and dividing it into three parts. The third part was all of the actual data records of arbitrary size and format. The second part contained the B220 procedures that knew how to get at records and fields to copy and update the third part. And the first part was an array or relative pointers into entry points of the procedures in the second part (the initial pointers were in a standard order representing standard meanings).

From here, Alan took the idea of having procedures attached to objects, without the system needing to know them in advance. Needless to say, Alan thought that was a pretty good idea and, as the saying goes, the rest is history.

But these are only teasers. You should really watch Ingalls’ videos (specially the last one) and read Kay’s article. You will have a great time.

Tags: , , ,

Follow

Get every new post delivered to your Inbox.

Join 26 other followers