Saturday, December 20, 2014

A commentary on 24 days of GHC extensions, part 3

It's time for some more Haskell opinions.

Day 13: Functional Dependencies

See day 12.

The addition of functional dependencies to multi-parameter type classes all of a sudden opened up the possibility of doing type level computations that had not been there before. The type level programming industry seems to have started with Thomas Hallgren's 2001 paper Fun with Functional Dependencies (which got some inspiration from the type level numeric computation available in (classic) Bluespec).

Day 14: Deriving

I don't know the history of deriving Functor et al. I believe Simon PJ added it after enough nagging.

I do have an intense dislike for GeneralizedNewtypeDeriving as implemented in GHC. As illustrated by ghc bug 1496 (found by Stefan O'Rear who was 14-15 at the time) it leads to an inconsistent type system. Instead of removing the root cause of the problem GHC HQ has decided to add a rather ingenious, but grotesque, solution where type variables are given roles. It's complex and it's going to get worse. It also brings its own problems (e.g., join can no longer be made a method in the monad class. WAT?!?!).

In the two compilers where I've implemented GND I've chosen a different route. The compiler tries to construct a proper instance declaration for the derived class. If it can be constructed and type checks then all is well. If it doesn't, there's something tricky going on and the instance has to be written by hand. I claim this covers 99% of all GND uses. And it makes me sleep better at night having had the instance declaration approved by the type checker.

I also prefer to make the compiler recognise and remove the identity functions that newtype gives rise to rather than forcing the user to explicitly using the new Coercible class.

Day 15: Derive Generic

The generic deriving extension is a clever idea from Clean, where all the meta-data about a data type (constructor names, types, etc) is encoded at the type level, and thus available at compile time.

The details of this encoding are rather gruesome, but I think it could be a lot nicer using the new type level strings, type level numbers, and type families. Time for a small redesign?

Day 16: Overloaded String

Overloaded strings was another extension I added to GHC because of my work on the Paradise DSL. Since Paradise is deeply embedded using DSL-level strings would be ugly unless the string literals were overloaded.

It was quite easy to add to GHC. The least easy part was adapting the defaulting rules to strings as well.

Worth noting is that pattern matching on overloaded strings turns into an equality comparison (just as with numeric literals), and so requires the string type to be an instance of Eq, which is not a superclass of IsString. Oh, and the name IsString for the class with the fromString was just temporary since I couldn't think of a good name. I guess it stuck.

Day 17: Rank-N Types

Rank-n types are, of course, a rather old idea, but it took a while before they made it into Haskell. At the Haskell Workshop in La Jolla in 1995 Mark Jones had a very nice paper, From Hindley-Milner Types to First-Class Structures. It shows how to add a form of records to Haskell, where the record components can be polymorhic. This has the same expressivity as current rank-n types, but looks a little different.

For example, the archetypical Haskell rank-2 function has this type now

runST :: (forall s . ST s a) -> a
With records with polymorphic components we have to introduce a new record type to write this. With modern Haskell it would look like
data ArgST a = ArgST (forall s . ST s a)
runST :: ArgST a -> a
Or with Mark's suggested anonymous record types
runST :: { arg :: forall s . ST s a } -> a
The 1996 release of HBC contained this kind of polymorphic components, but it did not have the anonymous records. GHC got rank-n (or was it rank-2?) types in 2000. The GHC rank-n types are somewhat more convenient, but requires function type signatures which the hbc version did not.

At the La Jolla ICFP we had a Haskell committee meeting about the next Haskell release, 1.3. If memory serves me, after a lot of discussions we decided that Haskell was going to get record types. Something along what Mark's paper suggested, but I think they were going to be named. The records would support polymorphic components (thus general universal quantification) and data types would support existentials. But after we split up, I guess the czar got cold feet and we ended up with the current Haskell records (I didn't really like them when I first saw them, and I still don't). Haskell could have been rather different if we had followed through on the decision (I'm not sure about better, but at least different).

Day 18: Existential Types

The idea of existential types is old, but the first time I saw it in an ML-like programming language was Läufer&Odersky's An Extension of ML with First-Class Abstract Types at the ML workshop in San Francisco in June 1992. I promptly implemented it and it was in version 0.998.1 of HBC in July 1992.

Adding existentials to ML is very easy; it's just a matter of tweaking the type checker to accept some more programs. Adding it to Haskell is a little more involved since the existential variables my appear in a context, and so a value of an existential type has to be a pair of a dictionaries and the actual value (ML only needs the actual value). The dictionaries are needed to use overloaded operations on the value.

What about syntax? I did it exactly like Läufer&Odersky, namely free variables in a data declaration are existentially quantified. E.g., data E = C a (a -> Int), since the type variable a does not appear in the head of the data declaration it is existentially quantified.

Some people (I remember Mark Jones and Simon PJ) argued fiercely against this, saying that it would be too easy to make mistakes and forget a type variable and get it existentially qualified. So when GHC finally got them they looked like data E = forall a . C a (a -> Int), using the confusing (but entirely logical!) forall to indicate an existential type. But then GADTs came along, and now it's suddenly no longer important to use the (pointless) forall, i.e., now we can write data E where { C :: a -> (a -> Int) -> E }. Oh, well.

Some people say that existential types is an anti-pattern in Haskell. I don't agree. It has a few perfectly legitimate uses, the problem is just that if you come from an OO background you'll probably reach for existentials in all kind of cases when there are better solutions.

Labels:

Friday, December 19, 2014

A commentary on 24 days of GHC extensions, part 2

Day 7: Type Operators

I understand the usefulness of the type operator GHC extension, but I don't like it. For value identifiers any operator not starting with a ':' is treated the same way as identifiers starting with a lower case letter. This is how it used to be on the type level too. But with the latest version of the TypeOperators extension these operators now behave like upper case identifiers and can be used to name types. An ugly wart, IMO.

Day 8: Recursive do

This extension stems from Levent Erkök's PhD thesis Value Recursion in Monadic Computations. The motivation was to be able to describe circuits with feedback using monadic notation.

There's actually two way to get recursive do-notation. You can either use the rec keyword inside the do or you can use the keyword mdo instead of do. Why mdo, you ask? It should really be μdo, where μ is the recursion operator. But people at GHC HQ are scared of Unicode, so instead we got mdo.

Day 9: Nullary Type Classes

Once you have accepted multi-parameter type classes then nullary type classes are a natural consequence; zero is also a number, after all.

In 1994 I worked on the pH language when I visited MIT. The acronym pH stands for "parallel Haskell", which is what it is. It had some imperative features that were not in a monad. This was in 1994, and the ST monad was just being invented, so we were trying out other things. To keep track of what parts of the program used imperative features I used a nullary type class Imperative that tainted any code that used assignment et al. (As far as I know, this is the earliest use of a nullary type class.) When mixing imperative code and pure code it is important to not be too polymorphic. ML-like languages use the value restriction to accomplish this. Since Haskell's monomorphism restriction is essentially the same as the value restriction, having the Imperative type class neatly accomplished the value restriction for pH.

Day 10: Implicit Parameters

The implicit parameter started as an attempt to simplify Haskell overloading. Haskell overloading, when implemented by dictionary passing, really has two parts: first there is a way to implicitly pass dictionaries around, and second, dictionaries are constructed automatically by using the instance declarations as a logical inference system.

The implicit parameters do the automagic passing of parameters, and is thus half of what is needed for overloading. But they were never used for overloading, instead it was discovered that they can be made to behave like dynamic binding, but with static types. This is quite cool! Even if one doesn't like dynamic binding (I don't).

There idea is presented Erik Meijer et al's paper Implicit Parameters: Dynamic Scoping with Static Types. It's worth reading.

Day 11: Type Families

Type families grew out of the need to have type classes with associated types. The latter is not strictly necessary since it can be emulated with multi-parameter type classes, but it gives a much nicer notation in many cases. The same is true for type families; they can also be emulated by multi-parameter type classes. But MPTC gives a very logic programming style of doing type computation; whereas type families (which are just type functions that can pattern match on the arguments) is like functional programming.

Using closed type families adds some extra strength that cannot be achieved by type classes. To get the same power from type classes we would need to add closed type classes. Which would be quite useful; this is what instance chains gives you.

Day 12: Multi-Parameter Type Classes

Even the original paper on Haskell type classes mentions multiple parameters, so this idea has been with us from the start. But it's not present in Stefan Kaes original work on type classes.

Unfortunately, multiple type class parameters very quickly lead to type ambiguities in the code, and we are forced to add type signatures. So MPTC was not really useful until Mark Jones introduced functional dependencies. Mark got the idea from database theory where the same idea is used to limit the relations in the database.

Labels:

Monday, December 15, 2014

A commentary on 24 days of GHC extensions

Ollie Charles has continued his excellent series of Christmas Haskell blogs. This year he talks about 24 Days of GHC Extensions. His blog posts are an excellent technical introduction to various Haskell extensions. Reading them inspired me to write some non-technical comments about the various extensions; giving a little bit of history and personal comments about them.

Day 1: View Patterns

View patterns have a long history. As far as I know views were first suggested by Phil Wadler in 1987, Views: a way for pattern matching to cohabit with data abstraction. As the title of the paper suggests, it was about being able to pattern match on abstract data types. Variations of Phil's suggestions have been implemented several times (I did it both in LML and for Haskell in hbc). In all these early suggestions the conversion between the concrete type and the view type were implicit, whereas in the ViewPatterns extension the conversion is explicit.

The addition of view patterns was partly spurred by the fact that F# has a very neat way of introducing pattern matching for abstract types, called active patterns. Since Simon Peyton Jones is in the same research lab as Don Syme it's natural that Simon couldn't let F# have this advantage over Haskell. :)

Day 2: Pattern Synonyms

Pattern synonyms is a more recent addition. The first time I remember hearing the idea was this the paper Abstract Value Constructors, and ever since then I wished that Haskell would have something like that. Patterns were one of the few constructs in Haskell that could not be named and reused. The first time they were added to Haskell was in the SHE preprocessor.

At a Haskell hackathon in Cambridge, August 2011, Simon Peyton Jones, Simon Marlow and I met to hash out how they could be added to GHC. I wanted to go beyond simple pattern synonyms. The simplest kind is uni-directional synonyms which can only occur in patterns. The simply bi-directional synonyms can be used both in patterns and expressions, but are limited in what they can do. The explicit bi-directional synonyms allow the full power of both patterns and expressions. With explicit bi-directional synonyms and view patterns we can finally implement views as envisioned by Phil, but now broken down into smaller reusable pieces. The result of our discussions at the hackathon can ge found here. But we were all too busy to implement any of it. All the hard implementation work, and fixing all the additional snags found, was done by Gergő Érdi.

I find this a very exciting extension, and you can take it even further, like Associated Pattern Synonyms in class declarations.

Day 3: Record Wildcards

The record wildcard extension allows you to open a record and get access to all the fields without using qualified names for them. The first time I encountered this idea was in Pascal which as a with-statement. It looks like with expr do stmt where the expr is a record value and inside the stmt all the fields of the record can be accessed directly. The construct later appeared in Ada as well, where it's called use.

So having something similar in Haskell doesn't seem so far fetched. But in was actually the dual, namely to construct values using record wildcard notation that inspired me to make this extension.

In 2006 I was developing the Paradise EDSL which is a DSL embedded in Haskell for describing computations and user interfaces. I wanted a way to make the EDSL less verbose and that's why record wildcards came about. Here's a simple example to illustrate the idea. We want to be able to input a coordinate record.

data Coord = Coord { x :: Double, y :: Double }
    coord :: P Coord
    coord = do
        x <- input 0.0
        y <- input 0.0
        return Coord{..}
This says that x and y are input and that their default value is 0. We need to have an input for every field in the Coord record and at the end we need to construct the record itself. Without the record wildcard the construction would have been Coord{x=x,y=y}. This isn't too bad, but for hundreds of inputs it gets tedious.

I made a quick and dirty implementation of this in GHC, but it was too ugly to get into the official release. Instead Dan Licata reimplemented (and extended) it under Simon PJ's supervision into what we have today.

I'm actually quite ambivalent about this extension. It's incredibly convenient (especially in the pattern form), but it introduces new bound names without an explicit mention of the name in the binder. This makes it harder to read code. The same objection can be made about the Haskell import declaration when it lacks an import list.

Day 4: Bang Patterns

I don't have much to say about BangPatterns. One day they simply appeared in a new GHC version. :)

The Clean language has long has something similar, but in Clean the annotation goes on the type instead of on the pattern. In Haskell it seems like a nice duality to have it on the pattern since there also lazy patterns introduced by ~.

Day 5: Rebindable Syntax

The rebindable syntax is almost like going back the older Haskell reports. They lacked an exact specification of where the identifier introduced by desugaring were bound. Of course, it was resolved so that they always refer to the Prelude identifiers. But when experimenting with other preludes it's very useful to be able to override this. Which is exactly what RebindableSyntax allows you to do.

In my opinion, this extension should be a little different. It ought to be possible to give a module name for where the names should be found. So normally this would be Prelude, but I'd like to be able to say RebindableSyntax=MyPrelude, and then all names introduced by desugaring will be found in MyPrelude (even if they are not in scope).

Day 6: List comprehensions

This bundles together a number of list comprehension extensions.

First, MonadComprehensions, which allows list comprehension syntax for any monad. This is simply going back to Haskell 1.4 where monad comprehensions were introduced. The monad comprehension syntax had been used by Phil Wadler before that. Monad comprehension were removed from 1.4, because they were deemed too confusing for beginners. Monad comprehensions were brought back to GHC by George Giorgidze et al.

The ParallelListComp allows zip like operation to be expression in the comprehension. The idea originates from Galois in the design of Cryptol. John Launchbury, Jeff Lewis, and Thomas Nordin were turning crypto networks into recursive equations and they wanted a nicer notation than using zipWith. (Thanks to Andy Adams-Moran for the history lesson.)

The origins TransformListComp are simple. It's just a way to bring more of the power of SQL into list comprehensions. It's an extension that introduces far too much special syntax for my taste, but the things you can do are quite neat.

Labels: