Wednesday, December 10, 2008

The OCaml code again

I'm posting a slight variation of the OCaml code that I think better shows what I like about the ML version.
module MkPerson(O: sig 
                     type xString
                     type xDouble
                     val opConcat : xString -> xString -> xString
                     val opShow : xDouble -> xString
                   end) =
struct
  open O
  type person = Person of (xString * xString * xDouble)
  let display (Person (firstName, lastName, height)) = 
    opConcat firstName (opConcat lastName (opShow height))
end

module BasicPerson = MkPerson(struct
                                type xString = string
                                type xDouble = float
                                let opConcat = (^)
                                let opShow = string_of_float
                              end)

open BasicPerson

let _ = 
  let p = Person ("Stefan", "Wehr", 184.0)
  in display p
Note how the open O opens the argument to the MkPerson functor and all the values and types from the argument is in scope in the rest of the module. There's no need to change lots of code in MkPerson.

Similarely, the open BasicPerson makes the operations from that module avaiable, so that Person and display can be used in a simple way.

Labels: , , ,

Abstracting on, suggested solutions

I guess I should be more constructive than just whining about how Haskell doesn't always do what I want. I do have some suggestions on how to fix things.

Explicit type applications

Let's look at a simple example again:
f :: forall a . a -> a
f = \ x -> x

b :: Bool
b = f True
The way I like to think of this (and what happens in ghc) is that this is shorthand for something more explicit, namely the Fw version of the same thing. In Fw all type abstraction and type application are explicit. Let's look at the explicit version (which is no longer Haskell).
f :: forall (a::*) . a -> a
f = /\ (a::*) -> \ (x::a) -> x

b :: Bool
b = f @Bool True
I'm using /\ for type abstraction and expr @type for type application. Furthermore each binder is annotated with its type. This is what ghc translates the code to internally, this process involves figuring out what all the type abstractions and applications should be.

Not something a little more complicated (from my previous post)

class C a b where
    x :: a
    y :: b

f :: (C a b) => a -> [a]
f z = [x, x, z]
The type of x is
x :: forall a b . (C a b) => a
So whenever x occurs two type applications have to be inserted (there's also a dictionary to insert, but I'll ignore that). The decorated term for f (ignoring the context)
f :: forall a b . (C a b) => a -> [a]
f = /\ (a::*) (b::*) -> \ (z::a) -> [ x @a @b1, x @a @b2, z]
The reason for the ambiguity in type checking is that the type check cannot figure out that the b is in any way connected to b1 and b2. Because it isn't. And there's currently no way we can connect them.

So I suggest that it should be possible to use explicit type application in Haskell when you want to. The code would look like this

f :: forall a b . (C a b) => a -> [a]
f z = [ x @a @b, x @a @b, z]
The order of the variables in the forall determines the order in which the type abstractions come, and thus determines where to put the type applications.

Something like abstype

Back to my original problem with abstraction. What about if this was allowed:
class Ops t where
    data XString t :: *
    (+++) :: XString t -> XString t -> XString t

instance Ops Basic where
    type XString Basic = String
    (+++) = (++)
So the class declaration says I'm going to use data types (which was my final try and which works very nicely). But in the instance I provide a type synonym instead. This would be like using a newtype in the instance, but without having to use the newtype constructor everywhere. The fact that it's not a real data type is only visible inside the instance declaration. The compiler could in fact make a newtype and insert all the coercions. This is, of course, just a variation of the abstype suggestion by Wehr and Chakravarty.

Labels: , ,

The abstraction continues

I got several comments to my lament about my attempts at abstraction in my previous blog post. Two of the comments involve adding an extra argument to display. I dont regard this as an acceptable solution; the changes to the code should not be that intrusive. Adding an argument to a function is a change that ripples through the code to many places and not just the implementation of display.

Reiner Pope succeeded where I failed. He split up the operations in Ops into two classes and presto, it works.

data Person t = Person {
    firstName :: XString t,
    lastName :: XString t,
    height :: XDouble t
    }

class (Show s, IsString s) => IsXString s where
    (+++) :: s -> s -> s
class (Num d, IsXString s) => IsXDouble s d where
    xshow :: d -> s

class (IsXDouble (XString t) (XDouble t)) => Ops t where
    type XString t :: *
    type XDouble t :: *
instance IsXString String where
    (+++) = (++)
instance IsXDouble String Double where
    xshow = show

data Basic = Basic

instance Ops Basic where
    type XString Basic = String
    type XDouble Basic = Double

display :: Ops t => Person t -> XString t
display p = firstName p +++ " " +++ lastName p +++ " " +++ xshow (height p + 1)
That's neat, but a little fiddly if there are many types involved.

Another problem

Armed with this solution I write another function.
incSpace :: (Ops t) => XDouble t -> XString t
incSpace x = xshow x +++ " "
It typechecks fine. But as far as I can figure out there is no way to use this function. Let's see what ghci says:
> :t incSpace (1 :: XDouble Basic) :: XString Basic

:1:0:
    Couldn't match expected type `[Char]'
           against inferred type `XString t'
    In the expression: incSpace (1 :: XDouble Basic) :: XString Basic

:1:10:
    Couldn't match expected type `XDouble t'
           against inferred type `Double'
    In the first argument of `incSpace', namely `(1 :: XDouble Basic)'
    In the expression: incSpace (1 :: XDouble Basic) :: XString Basic
Despite my best efforts at providing types it doesn't work. The reason being that saying, e.g., (1 :: XDouble Basic) is the same as saying (1 :: Double). And that doesn't match XDouble t. At least not to the typecheckers knowledge.

In the example of display things work because the parameter t occurs in Person t which is a real type and not a type family. If a type variable only occurs in type family types you are out of luck. There's no way to convey the information what that type variable should be (as far as i know). You can "solve" the problem by adding t as an argument to incSpace, but again, I don't see that as a solution.

In the paper ML Modules and Haskell Type Classes: A Constructive Comparison Wehr and Chakravarty introduce a notion of abstract associated types. That might solve this problem. I really want XDouble and XString to appear as abstract types (or associated data types) outside of the instance declaration. Only inside the instance declaration where I provide implementations for the operations do I really care what the type is.

A reflection on type signatures

If I write
f x = x
Haskell can deduce that the type is f :: a -> a.

If I instead write

f :: Int -> Int
f x = x
Haskell happily uses this type. The type checker does not complain as to say "Sorry dude, but you're wrong, the type is more general than what you wrote.". I think that's nice and polite.

Now a different example.

class C a b where
    x :: a
    y :: b

f z = [x, x, z]
What does ghc have to say about the type of f?
f :: (C a b, C a b1) => a -> [a]
OK, that's reasonable; the two occurences of x could have different contexts. But I don't want them to. Let's add a type signature.
f :: (C a b) => a -> [a]
f z = [x, x, z]
What does ghc have to say?
Blog2.hs:9:7:
    Could not deduce (C a b) from the context (C a b2)
      arising from a use of `x' at Blog2.hs:9:7
    Possible fix:
      add (C a b) to the context of the type signature for `f'
    In the expression: x
    In the expression: [x, x, z]
    In the definition of `f': f z = [x, x, z]

Blog2.hs:9:10:
    Could not deduce (C a b1) from the context (C a b2)
      arising from a use of `x' at Blog2.hs:9:10
    Possible fix:
      add (C a b1) to the context of the type signature for `f'
    In the expression: x
    In the expression: [x, x, z]
    In the definition of `f': f z = [x, x, z]
Which is ghc's way of say "Dude, I see your context, but I'm not going to use it because I'm more clever than you and can figure out a better type." Rude, is what I say.

I gave a context, but there is nothing to link the b in my context to what ghc internally figures out that the type of the two occuerences of x should. I wish I could tell the type checker, "This is the only context you'll ever going to have, use it if you can." Alas, this is not how things work.

A little ML

Stefan Wehr provided the ML version of the code that I only aluded to
module MkPerson(O: sig 
                     type xString
                     type xDouble
                     val opConcat : xString -> xString -> xString
                     val opShow : xDouble -> xString
                   end) =
struct
  type person = Person of (O.xString * O.xString * O.xDouble)
  let display (Person (firstName, lastName, height)) = 
    O.opConcat firstName (O.opConcat lastName (O.opShow height))
end

module BasicPerson = MkPerson(struct
                                type xString = string
                                type xDouble = float
                                let opConcat = (^)
                                let opShow = string_of_float
                              end)

let _ = 
  let p = BasicPerson.Person ("Stefan", "Wehr", 184.0)
  in BasicPerson.display p
In this case, I think this is the natural way of expressing the abstraction I want. Of course, this ML code has some shortcomings too. Since string literals in ML are not overloaded the cannot be used neatly in the display function like I could in the Haskell version, but that's a minor point.

Labels: , ,

A somewhat failed adventure in Haskell abstraction

I usually blog about weird and wonderful things you can do in Haskell. Today I'm going to talk about something very plain and not wonderful at all.

If you want to try out the code below, use these Haskell extensions:

{-# LANGUAGE TypeFamilies, MultiParamTypeClasses, OverloadedStrings,
   FlexibleInstances, TypeSynonymInstances, ScopedTypeVariables,
   FunctionalDependencies, RecordWildCards, FlexibleContexts,
   GeneralizedNewtypeDeriving #-}

The simple problem

We want to define a type for a person which has a few fields and operations. Like this
module Person(Person(..), display) where

data Person = Person {
   firstName :: String,
   lastName  :: String,
   height    :: Double
   }

display :: Person -> String
display p = firstName p ++ " " ++ lastName p ++ " " ++ show (height p + 1)
Very simple. To use it we can just import the module and the write something like
 print $ display $ Person { firstName = "Manuel", lastName = "Peyton Jones", height = 255 }
But being efficiancy conscious I'm not happy with using String and Double. I'd like to experiment with different types for these. Maybe I should use ByteString and Int instead?

Simple enough, let's abstract out the types and operations into a different module.

module Ops(XString, XDouble, (+++), xshow) where
import Data.String
newtype XString = XString String deriving (Eq, Show, IsString)
newtype XDouble = XDouble Double deriving (Eq, Show, Num)

(+++) :: XString -> XString -> XString
XString x +++ XString y = XString (x ++ y)

xshow :: XDouble -> XString
xshow (XDouble x) = XString (show x)

module Person(Person(..), display) where
import Ops

data Person = Person {
   firstName :: XString,
   lastName  :: XString,
   height    :: XDouble
   }

display :: Person -> XString
display p = firstName p +++ " " +++ lastName p +++ " " +++ show (height p + 1)
There, problems solved. By changing the import in the Person module you can try out different types for XString and XDouble.

No, this is not problem solved. To try out different implementations I need to edit the Person module. That's not abstraction, that's obstruction. It should be possible to write the code for the Person module once and for all once you decided to abstract, and then never change it again.

I also didn't really want to necessarily have newtype in my module. Maybe I'd want this:

module Ops(XString, XDouble, (+++), xshow) where
type XString = String
type XDouble = Double

(+++) :: XString -> XString -> XString
(+++) = (++)

xshow :: XDouble -> XString
xshow = show
You can define Ops that way, but then the implementation of Ops may leak into the Person module. What you really want is to type check Person against the signature of Ops, like
interface Ops where
type XString
type XDouble
(+++) :: XString -> XString -> XString
xshow :: XDouble -> XString
And later supply the actual implementation. Alas, Haskell doesn't allow this.

In ML (SML or O'Caml) this would be solved by using a functor. The Person module would be a functor that takes the Ops module as an argument and yields a new module. And then you can just plug and play with different Ops implementations. This is where ML shines and Haskell sucks.

Type classes

But the defenders of Haskell superiority say, Haskell has type classes, that's the way to abstract! So let's make Ops into a type class. Let's do old style with multiple parameters first. Since Ops defines two types it will correspond to having two type parameters to the class.
class (IsString xstring, Num xdouble) => Ops xstring xdouble where
   (+++) :: xstring -> xstring -> xstring
   xshow :: xdouble -> xshow
Ok, so how do we have to rewrite the Person module?
data Person xstring xdouble = Person {
   firstName :: xstring,
   lastName  :: xstring,
   height    :: xdouble
   }

display :: (Ops xstring xdouble) => Person xstring xdouble -> xstring
display p = firstName p +++ " " +++ lastName p +++ " " +++ xshow (height p + 1)
An implementation is provided by an instance declaration:
instance Ops String Double where
   (+++) = (++)
   xshow = show
We see the major flaw in this approch at once. The Person data type now has two parameters. This might be bearable, but imagine a more complicated example where Ops contains 15 types. And every time you add a field with a new type to Person you have to update every single place in the program that mentions the Person type. That's not abstraction.

But in fact, it's even worse than that. The definition of display might look plausible, but it's full of ambiguities. Compiling it gives lots of errors of this kind:

   Could not deduce (Ops xstring xdouble)
     from the context (Ops xstring xdouble4)
Well, we can remove the type signature and let GHC figure it out. The we get this
display :: (Ops xstring xdouble,
           Ops xstring xdouble3,
           Ops xstring xdouble2,
           Ops xstring xdouble1,
           Ops xstring xdouble4) =>
          Person xstring xdouble4 -> xstring
And this function can, of course, never be used because most of the type variables do not occur outside the context so they will never be determined. I don't even know how to put explicit types in the function to make it work.

Well, it's common knowledge that multi-parameter type classes without functional dependencies is asking for trouble. So can we add some functional dependencies? Sure, if we use

class (IsString xstring, Num xdouble) => Ops xstring xdouble | xstring -> xdouble where
then things work beautifully. Until we decide that another instance that would be interesting to try is
instance Ops String Int
which is not valid with the FD present.

So we can't have functional dependencies if we want to have flexibilty with the instances. So what is it that goes wrong without the FDs? It's that all the uses (+++) and xshow are not tied together, they could potentially have different types. Let's try and be sneaky and tie them together:

display :: (Ops xstring xdouble) => Person xstring xdouble -> xstring
display p =
   let (++++) = (+++); xxshow = xshow
   in  firstName p ++++ " " ++++ lastName p ++++ " " ++++ xxshow (height p + 1)
This only generates one error message, because there's still nothing that says the the two operations come from the same instance. We need to make the tie even closer.
class (IsString xstring, Num xdouble) => Ops xstring xdouble where
   ops :: (xstring -> xstring -> xstring, xdouble -> xstring)
instance Ops String Double where
   ops = ((++), show)

display :: (Ops xstring xdouble) => Person xstring xdouble -> xstring
display p =
   let ((+++), xshow) = ops
   in  firstName p +++ " " +++ lastName p +++ " " +++ xshow (height p + 1)
This actually works! We can make it neater looking.
class (IsString xstring, Num xdouble) => Ops xstring xdouble where
   ops :: DOps xstring xdouble

data DOps xstring xdouble = DOps {
   (+++) :: xstring -> xstring -> xstring,
   xshow :: xdouble -> xstring
   }

instance Ops String Double where
   ops = DOps (++) show

display :: (Ops xstring xdouble) => Person xstring xdouble -> xstring
display p =
   let DOps{..} = ops
   in  firstName p +++ " " +++ lastName p +++ " " +++ xshow (height p + 1)
We have basically packaged up the dictionary and unpack it ourselves to get access to the operations. It's not pleasent, but it works.

But as I already said, the multiparameter type class version isn't really a good solution to the problem even if it works; it introduces too many parameters to the Person record.

Associated types

The new and shiny way of doing type classes is to use associated types instead of FDs. So let's give that a try. So what should the associated types be in the class. The associated type is supposed to be the one that can be computed from the main one. But we have two types that are on equal footing, so there is no main one. We can remedy that by introducing an artificial third type that is the main one, it can then determine the other two.
class (IsString (XString t), Num (XDouble t)) => Ops t where
   type XString t :: *
   type XDouble t :: *
   (+++) :: XString t -> XString t -> XString t
   xshow :: XDouble t -> XString t

data Person t = Person {
   firstName :: XString t,
   lastName  :: XString t,
   height    :: XDouble t
   }
That looks pretty neat. Note how the Person record has one parameter and no matter how many new associated type we add it will still only have one parameter. One parameter is reasonable, the Person record is after all parameterized over what kind of Ops we are providing.

Let's do an instance. It will need the extra type that is somehow the name of the instance.

data Basic = Basic

instance Ops Basic where
   type XString Basic = String
   type XDouble Basic = Double
   (+++) = (++)
   xshow = show
Now what about the display function? Alas, now it breaks down again. The display function is full of type errors again. And the reason is similar to the multiparameter version; there's nothing that ties the operations together.

We can play the same trick as with DOps above, but for some reason it doesn't work this time. The type comes out as

display :: (XString t ~ XString a,
           XDouble t ~ XDouble a,
           Ops a,
           Num (XDouble t)) =>
          Person t -> XString a
I have no clue why. I find associated types very hard to get a grip on.

OK, multi-parameter type classes made things work, but had too many type parameters. And associated types is the other way around. You can try combining them, but it didn't get me anywhere closer.

Associated data types

OK, I won't admit defeat yet. There's still associated data types. They are easier to deal with than associated types, because the type function is guaranteed to be injective.
class (IsString (XString t), Num (XDouble t)) => Ops t where
   data XString t :: *
   data XDouble t :: *
   (+++) :: XString t -> XString t -> XString t
   xshow :: XDouble t -> XString t

data Basic = Basic

instance Ops Basic where
   newtype XString Basic = XSB String deriving (Eq, Ord, Show)
   newtype XDouble Basic = XDB Double deriving (Eq, Ord, Show)
   XSB x +++ XSB y = XSB (x ++ y)
   xshow (XDB x) = XSB (show x)
instance Num (XDouble Basic) where
   XDB x + XDB y = XDB (x+y)
   fromInteger = XDB . fromInteger
instance IsString (XString Basic) where
   fromString = XSB
At last, this actually works! But it's at a price. We can no longer use the types we want in the instance declaration, instead we are forced to invent new types. Using this approach the original multi-parameter version could have been made to work as well.

Normally the GeneralizedNewtypeDeriving language extension makes it relatively painless to introduce a newtype that has all the instances of the underlying type. But due to a bug in ghc you can't use this extension for associated newtypes. So we have to make manual instance declarations which makes this approach very tedious.

Conclusion

I have found no way of doing what I want. My request is very simple, I want to be able to abstract over the actual implementation of a module, where the module contains types, values, and instances.

Haskell normally excels in abstraction, but here I have found no natural way of doing what I want. Perhaps I'm just not clever enough to figure out how, but that is a failure of Haskell too. It should not take any cleverness to do something as simple as this. In ML this is the most natural thing in the world to do.

Associated types are not a replacement for a proper module system. They let you do some things, but others just don't work.

I'd be happy to see anyone doing this in Haskell in a simple way.

Labels: , ,