Saturday, December 20, 2014

A commentary on 24 days of GHC extensions, part 3

It's time for some more Haskell opinions.

Day 13: Functional Dependencies

See day 12.

The addition of functional dependencies to multi-parameter type classes all of a sudden opened up the possibility of doing type level computations that had not been there before. The type level programming industry seems to have started with Thomas Hallgren's 2001 paper Fun with Functional Dependencies (which got some inspiration from the type level numeric computation available in (classic) Bluespec).

Day 14: Deriving

I don't know the history of deriving Functor et al. I believe Simon PJ added it after enough nagging.

I do have an intense dislike for GeneralizedNewtypeDeriving as implemented in GHC. As illustrated by ghc bug 1496 (found by Stefan O'Rear who was 14-15 at the time) it leads to an inconsistent type system. Instead of removing the root cause of the problem GHC HQ has decided to add a rather ingenious, but grotesque, solution where type variables are given roles. It's complex and it's going to get worse. It also brings its own problems (e.g., join can no longer be made a method in the monad class. WAT?!?!).

In the two compilers where I've implemented GND I've chosen a different route. The compiler tries to construct a proper instance declaration for the derived class. If it can be constructed and type checks then all is well. If it doesn't, there's something tricky going on and the instance has to be written by hand. I claim this covers 99% of all GND uses. And it makes me sleep better at night having had the instance declaration approved by the type checker.

I also prefer to make the compiler recognise and remove the identity functions that newtype gives rise to rather than forcing the user to explicitly using the new Coercible class.

Day 15: Derive Generic

The generic deriving extension is a clever idea from Clean, where all the meta-data about a data type (constructor names, types, etc) is encoded at the type level, and thus available at compile time.

The details of this encoding are rather gruesome, but I think it could be a lot nicer using the new type level strings, type level numbers, and type families. Time for a small redesign?

Day 16: Overloaded String

Overloaded strings was another extension I added to GHC because of my work on the Paradise DSL. Since Paradise is deeply embedded using DSL-level strings would be ugly unless the string literals were overloaded.

It was quite easy to add to GHC. The least easy part was adapting the defaulting rules to strings as well.

Worth noting is that pattern matching on overloaded strings turns into an equality comparison (just as with numeric literals), and so requires the string type to be an instance of Eq, which is not a superclass of IsString. Oh, and the name IsString for the class with the fromString was just temporary since I couldn't think of a good name. I guess it stuck.

Day 17: Rank-N Types

Rank-n types are, of course, a rather old idea, but it took a while before they made it into Haskell. At the Haskell Workshop in La Jolla in 1995 Mark Jones had a very nice paper, From Hindley-Milner Types to First-Class Structures. It shows how to add a form of records to Haskell, where the record components can be polymorhic. This has the same expressivity as current rank-n types, but looks a little different.

For example, the archetypical Haskell rank-2 function has this type now

runST :: (forall s . ST s a) -> a
With records with polymorphic components we have to introduce a new record type to write this. With modern Haskell it would look like
data ArgST a = ArgST (forall s . ST s a)
runST :: ArgST a -> a
Or with Mark's suggested anonymous record types
runST :: { arg :: forall s . ST s a } -> a
The 1996 release of HBC contained this kind of polymorphic components, but it did not have the anonymous records. GHC got rank-n (or was it rank-2?) types in 2000. The GHC rank-n types are somewhat more convenient, but requires function type signatures which the hbc version did not.

At the La Jolla ICFP we had a Haskell committee meeting about the next Haskell release, 1.3. If memory serves me, after a lot of discussions we decided that Haskell was going to get record types. Something along what Mark's paper suggested, but I think they were going to be named. The records would support polymorphic components (thus general universal quantification) and data types would support existentials. But after we split up, I guess the czar got cold feet and we ended up with the current Haskell records (I didn't really like them when I first saw them, and I still don't). Haskell could have been rather different if we had followed through on the decision (I'm not sure about better, but at least different).

Day 18: Existential Types

The idea of existential types is old, but the first time I saw it in an ML-like programming language was Läufer&Odersky's An Extension of ML with First-Class Abstract Types at the ML workshop in San Francisco in June 1992. I promptly implemented it and it was in version 0.998.1 of HBC in July 1992.

Adding existentials to ML is very easy; it's just a matter of tweaking the type checker to accept some more programs. Adding it to Haskell is a little more involved since the existential variables my appear in a context, and so a value of an existential type has to be a pair of a dictionaries and the actual value (ML only needs the actual value). The dictionaries are needed to use overloaded operations on the value.

What about syntax? I did it exactly like Läufer&Odersky, namely free variables in a data declaration are existentially quantified. E.g., data E = C a (a -> Int), since the type variable a does not appear in the head of the data declaration it is existentially quantified.

Some people (I remember Mark Jones and Simon PJ) argued fiercely against this, saying that it would be too easy to make mistakes and forget a type variable and get it existentially qualified. So when GHC finally got them they looked like data E = forall a . C a (a -> Int), using the confusing (but entirely logical!) forall to indicate an existential type. But then GADTs came along, and now it's suddenly no longer important to use the (pointless) forall, i.e., now we can write data E where { C :: a -> (a -> Int) -> E }. Oh, well.

Some people say that existential types is an anti-pattern in Haskell. I don't agree. It has a few perfectly legitimate uses, the problem is just that if you come from an OO background you'll probably reach for existentials in all kind of cases when there are better solutions.

Labels:

2 Comments:

Blogger nomeata said...

Hi,

what precisely do you mean by the following?

> I also prefer to make the compiler
recognise and remove the identity functions that newtype gives rise to rather than forcing the user to explicitly using the new Coercible class.

The identitiy functions created by newtype constructors were always removed during compilation. What was not removed was things like "map id", where the id used to be a newtype constructur. Did your compilers remove such a useless map somehow?

Saturday, December 20, 2014 at 2:07:00 PM GMT  
Blogger Ben said...

I was confused for a long time by the presence of the forall quantifier on existential types. Yes, it makes logical sense, but the problem was that many introductions to existential types in Haskell focus on the forall keyword, giving the impression that it has deeper meaning than simply being a syntactic hoop to jump through in order to use a free type variable!

Now that you've explained the situation so clearly, I feel I can finally put to rest the small, nagging sense that I didn't quite understand something, even though I had been using existential types for some time now. Thanks!

I hope the lesson has now been learned that putting these pointless hoops into the language actively harms, rather than helps, learners.

Sunday, December 21, 2014 at 10:41:00 AM GMT  

Post a Comment

<< Home