Suppose you had a list that you wanted to split according to some predicate, like splitting it up into the portion that’s greater than 7, and the portion that’s less than 7. Then we could take advantage of the partition
function from Data.List
:
-- A reminder, the type of partition is:
partition :: (a -> Bool) -> [a] -> ([a], [a])
-- | Moves the elements of a given list that are greater than 7 to the front.
greaterThan7sFirst :: [Int] -> [Int]
=
greaterThan7sFirst xs let (lessThans, greaterThans) = partition (>7) xs
in greaterThans ++ lessThans
Pretty clean, right? Well when we run it:
> greaterThan7sFirst [4,9,4,8,2,7,9,2,6]
ghci4,4,2,7,2,6,9,8,9] [
Oops! It looks like we put things in the wrong order. The fix is simple, of course, just swap the order in the tuple.
But it’d be nice if we could prevent these sorts of mix-ups from happening in the first place. Also,
it would be nice if we could reflect this in the type signature of partition
, that way we don’t even need to read the documentation
every time we want to remember which half of the tuple is which.
The “separate newtype
s” Approach
So we want to reflect the different meanings of the lists in the type system. So we can create a couple of newtypes:
newtype TruePart a = Trues [a]
deriving (Eq, Ord, Show, Read)
newtype FalsePart a = Falses [a]
deriving (Eq, Ord, Show, Read)
Then our result tuple will simply contain a true part and a false part.
partitionTyped :: (a -> Bool) -> [a] -> (TruePart a, FalsePart a)
=
partitionTyped p xs let (trues, falses) = partition p xs
in (Trues trues, Falses falses)
Already we have more clarity in the type signature. Now we can adjust our greaterThan7sFirst
function to use partitionTyped'
:
-- | Moves the elements of a given list that are greater than 7 to the front.
greaterThan7sFirst :: [Int] -> [Int]
=
greaterThan7sFirst xs let (Falses lessThans, Trues greaterThans) = partitionTyped (>7) xs
in greaterThans ++ lessThans
Immediately we get a complaint:
• Couldn't match type ‘TruePart Int’ with ‘FalsePart a’
Expected type: (FalsePart a, TruePart a1)
Actual type: (TruePart Int, FalsePart Int)
This is great! The type checker has successfully saved us from our own foolishness. Fixing our function makes it type check:
-- | Moves the elements of a given list that are greater than 7 to the front.
greaterThan7sFirst :: [Int] -> [Int]
=
greaterThan7sFirst xs let (Trues greaterThans, Falses lessThans) = partitionTyped (>7) xs
in greaterThans ++ lessThans
And testing it out, we have our desired behaviour:
> greaterThan7sFirst [4,9,4,8,2,7,9,2,6]
ghci9,8,9,4,4,2,7,2,6] [
This approach does have the downside that we’ve had to introduce two new datatypes to accomplish the task. Any instances we write for one of them need to be duplicated for the other. What if we had a single type that did both jobs?
The Phantom Types Approach
Consider the following type instead:
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
newtype Part (bool :: Bool) a = Part { unPart :: [a] }
This type has a seemingly extraneous type parameter with what appears to be a type signature on it, (bool :: Bool)
.
This field is indeed “extraneous” in a sense, but what it does is that when we construct a Part
we can choose
what we want bool
to be. If we have two values where the value of bool
is different, then we won’t be able to
confuse one for the other, since their types will be different, even if the actual runtime values are the same.
bool
is called a phantom type parameter, since it doesn’t appear on the right-hand side of the =
.
So what sorts of values can bool
take on? Well, that syntax that looks like a type signature is actually a kind signature.
So bool
exists on the type level and thus has a kind that dictates what sort of values it can take on. In this case,
its kind is Bool
, so it can take on any of the values of Bool
, meaning True
or False
. This can be a bit confusing
since normally True
and False
are runtime values. The DataKinds
extension essentially creates two new types
called 'True
and 'False
with kind Bool
. These are separate from the value level True
and False
. The compiler
allows us to leave off the tick marks if it’s still clear what we’re referring to. 1
What does this look like in practice? Here’s an example:
truePart :: Part True Int
= Part [1,2,3,4]
truePart
falsePart :: Part False Int
= Part [1,2,3,4]
falsePart
-- The following gives a type error!
= [truePart, falsePart] parts
• Couldn't match type ‘'False’ with ‘'True’
Expected type: Part 'True Int
Actual type: Part 'False Int
So we can’t define parts
since it would have two elements of different types, even though
their contents are the same. This is a good thing, we can use this to implement
a type-safe partition
:
partitionTyped :: (a -> Bool) -> [a] -> (Part True a, Part False a)
=
partitionTyped p xs let (trues, falses) = partition p xs
in (Part trues, Part falses)
There’s an issue when we go to implement greaterThan7sFirst
, though. We don’t have two
separate constructors to match on as we did with the newtypes. We can work around that,
but it’s not very ergonomic:
greaterThan7sFirst :: [Int] -> [Int]
=
greaterThan7sFirst xs let greaterThans :: Part True Int
lessThans :: Part False Int
= partitionTyped (>7) xs
(greaterThans, lessThans) in unPart greaterThans ++ unPart lessThans
It would be much nicer to have separate constructors that we could match on like with the two newtypes. Turns out pattern synonyms are just the thing we need!
{-# LANGUAGE PatternSynonyms #-}
pattern Trues :: [a] -> Part True a
pattern Trues xs = Part xs
{-# COMPLETE Trues #-}
pattern Falses :: [a] -> Part False a
pattern Falses xs = Part xs
{-# COMPLETE Falses #-}
So pattern synonyms allow us to define, well, synonyms for existing patterns. In this case,
we’ve defined two patterns Trues
and Falses
that are the same as Part
. The only difference
is that we’ve restricted their types. If we ask GHCi the type of Part
we get:
> :t Part
ghciPart :: [a] -> Part bool a
So of course, Part
can return a Part
with any value of bool
that we choose. In the case of Trues
we
want to restrict its type to only match when the type of the receiving value has a bool
parameter
equal to True
. By giving Trues
this restricted type we accomplish two things:
- Any values created with
Trues
will haveTrue
as thebool
type parameter. - Any values matched on by
Trues
must haveTrue
as thebool
type parameter.
In this case, we’re concerned with the second behaviour, but the first is nice to have as well. So if we go into GHCi we can witness these two facts in action:
> -- Fact 1:
ghci> :t Trues "abcd"
ghciTrues "abcd" :: Part 'True Char
> -- Fact 2:
ghci> Trues t = truePart
ghci> Falses f = truePart
ghci
<interactive>:9:12: error:
Couldn't match type ‘'True’ with ‘'False’
• Expected type: Part 'False Int
Actual type: Part 'True Int
In the expression: truePart
• In a pattern binding: Falses f = truePart
The COMPLETE
pragmas above are there to tell GHC that when pattern matching, matching only on
Trues
(or only on Falses
) constitutes a complete pattern match, this way GHC doesn’t give
us a “missing patterns” warning if we don’t provide a fallback case.
Now we can implement greaterThan7sFirst
nicely:
greaterThan7sFirst :: [Int] -> [Int]
=
greaterThan7sFirst xs let (Trues greaterThans, Falses lessThans) = partitionTyped (>7) xs
in greaterThans ++ lessThans
And if we mess up then we get an error:
greaterThan7sFirst :: [Int] -> [Int]
=
greaterThan7sFirst xs let (Falses lessThans, Trues greaterThans) = partitionTyped (>7) xs
in greaterThans ++ lessThans
• Couldn't match type ‘'True’ with ‘'False’
Expected type: (Part 'False Int, Part 'True Int)
Actual type: (Part 'True Int, Part 'False Int)
Nice. As a bonus, let’s implement Show
instances that display the results
using our pattern synonym syntax. We’ll need FlexibleInstances
to convince GHC
that it’s okay to write separate instances for True
and False
.
{-# LANGUAGE FlexibleInstances #-}
instance Show a => Show (Part False a) where
showsPrec n (Falses xs) = showParen (n > 10) (showString "Falses " . shows xs)
instance Show a => Show (Part True a) where
showsPrec n (Trues xs) = showParen (n > 10) (showString "Trues " . shows xs)
> partitionTyped (>7) [4,9,4,8,2,7,9,2,6]
ghciTrues [9,8,9],Falses [4,4,2,7,2,6]) (
Wordle: Guesses and Masters
This particular motivating example may feel a bit overcomplicated for what it’s trying to do, and
indeed the solution with two newtypes was already quite satisfactory. The benefits to using phantom types
with DataKinds
really come up more in situations where you may want to work more with the type, unlike here where it exists purely to be pattern matched on immediately like in this example.
As an example, a while ago I wrote a Wordle solver that makes use of this technique to differentiate between words that represent a guess versus words that represent a possible solution. In that case, I had something like
-- "Wordlet" is a cutesy word that I use to refer to a Wordle word.
data WordletType = Guess | Master
newtype Wordlet (wlty :: WordletType) = Wordlet { ... }
newtype WordList (wlty :: WordletType) = WordList { getWordList :: Data.Vector.Vector (Wordlet wlty) }
Because words and word lists are tagged, it means that on the one hand, I have type safety, so if I have a function like
againstMaster :: Wordlet Guess -> Wordlet Master -> Colors
which checks a given guess word against a candidate master word, then I can use it to write the following function, which checks if a given guess word and feedback to that guess word is consistent with some possible master word:
consistentWith :: Colors -> Wordlet Guess -> Wordlet Master -> Bool
= guess `againstMaster` master == colors consistentWith colors guess master
This is just a small example, but the fact that words are tagged ensures that I can’t mess up
the argument order when applying againstMaster
. Another example would be the filterMasters
function, which filters a word list full of master word candidates to only those that
are consistent with a given guess word and response to that guess word
filterMasters :: Colors -> Wordlet Guess -> WordList Master -> WordList Master
WordList masters) =
filterMasters colors guess (WordList $ Data.Vector.filter (consistentWith colors guess) masters
The types guide us towards a solution. We have a vector of WordList Master
s and we want
to filter it. So of course we need Data.Vector.filter
. The filtering function needs to combine
some Colors
, a Wordlet Guess
and a WordLet Master
in some way to give us a Bool
. This is
exactly consistentWith
. 2
When I first added these phantom types to my code it indeed found a couple of places where I was implicitly confusing a master word for a guess word.
As far as how this is beneficial over a couple of newtypes, it means that I don’t need separate functions for dealing with guess words and master words in cases where it doesn’t matter. For example:
wordletToString :: Wordlet wlty -> String
parseWordlet :: Text -> Either Text (Wordlet wlty)
In both of these cases it doesn’t matter whether we’re dealing with a guess word or a master word, and the phantom types approach makes it very easy to be polymorphic over the type of word, since the implementation is the same, and thus they only involve a single function body.
So we’ve learned today that phantom types give us the ability to both communicate more in our type signatures, and they can help us prevent mistakes and misuses by allowing us to track where a value came from.