Saturday, July 09, 2011

Impredicative polymorphism, a use case

In a recent question on stackoverflow I made a comment about how Haskell can be considered a good imperative language because you can define abstractions to make it convenient. When I was going to make my point by implementing a simple example of it I found that what I wanted to do no longer works in ghc-7.0.4. Here's a simple example of what I wanted to do (which works in ghc-6.12.3). It's a simple library that makes Haskell code look a bit like Python.

{-# LANGUAGE ExtendedDefaultRules, TupleSections #-}
module Main where
import qualified Prelude
import Boa

last_elt(xs) = def $: do
    assert xs "Empty list"
    lst <- var xs              -- Create a new variable, lst
    ret <- var (xs.head)
    while lst $: do
        ret #= lst.head        -- Assign variable ret
        lst #= lst.tail
    return ret

first_elt(xs) = def $: do
    l <- var xs
    l.reverse                  -- Destructive reverse
    return (last_elt(l))

factorial(n) = def $: do
    assert (n<=0) "Negative factorial"
    ret <- var 1
    i <- var n
    while i $: do
        ret *= i
        i -= 1
    return ret

test = def $: do
    print "Hello"
    print ("factorial 10 =", factorial(10))

main = do
    test
    l <- varc [1, 2, 3]
    print ("first and last:",)
    print (first_elt(l),)
    print (last_elt(l))

On the whole it's pretty boring, except for one thing. In imperative languages there is (usually) a disctinction between l-values and r-values. An l-value represent a variable, i.e., something that can be assigned, whereas an r-value is simply a value. So in Python, in the statement x = x + 1 the x on the left is an l-value (it's being assigned), whereas on the right it's an r-value. You can use the same notation for both in most imperative languages. If you want to do the same in Haskell you have (at least) two choices. First you can unify the concepts of l-value and r-value and have a runtime test that you only try to assign variables. So, e.g., 5 = x would type check but have a runtime failure. I will not dwell further on this since it's not very Haskelly.
Instead, we want to have l-values and r-values being statically type checked. Here's the interesting bit of the types for a simple example.
 
data LValue a
data RValue a
instance (Num a) => Num (RValue a)

class LR lr
instance LR RValue
instance LR LValue

var :: RValue a -> IO (forall lr . (LR lr) => lr a)
(#=) :: LValue a -> RValue a -> IO ()

foo = do
    x <- var 42
    x #= x + 1

We have two type constructors LValue and RValue representing l-values and r-values of some type a. The r-values is an instance of Num. Furthermore, the class LR where the type is either LValue or RValue.
The var function creates a new variable given a value. The return type of var is the interesting part. It says that the return value is polymorphic; it can be used either as an l-value or as r-value. Just as we want.
The assignment operator, (#=), takes an l-value, an r-value, and returns nothing.

So in the example we expect x to have type forall lr . (LR lr) => lr a, in which case the assignment will type check.
If we try to compile this we get
    Illegal polymorphic or qualified type:
      forall (lr :: * -> *). LR lr => lr a
    Perhaps you intended to use -XImpredicativeTypes
    In the type signature for `var':
      var :: RValue a -> IO (forall lr. LR lr => lr a)

The problem is that universal quantification is not normally allowed as the argument to a type constructor. This requires the impredicative polymorphism extension to ghc. If we turn it on it compiles fine in ghc-6.12.3.
But, with ghc-7.0.4 we get
    Couldn't match expected type `LValue a0'
                with actual type `forall (lr :: * -> *). LR lr => lr a1'
    In the first argument of `(#=)', namely `x'
    In the expression: x #= x + 1

I can't really explain the rational behind the change in the ghc type system (Simon say it's simpler now), but I feel this has really made ghc worse for defining DSELs. When you define a DSEL you usually use the do-notation for the binding construct in the embedded language. This is the only programmable binding construct (by overloading (>>=)), so there is little choice. With the change in ghc-7 it means you can no longer make DSELs with a polymorhic binding construct (like I wanted here), because the binder seems to be monomorphic now

Please Simon, can we have polymorphic do bindings back?