In my last post, I described how integrating with a filtering API for querying a database gave me an opportunity to apply mutually recursive types. In this post, I'll explain another unusual technique it led me to use—phantom types—and how to apply it.
The filtering API I mentioned represents its criteria using lists of conditions that satisfy the filter. The specification describes:
- The possible conditions available
- The data types the filter supports
- Which conditions apply to which types
Here's a chart based on a simplified version of the specification:
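| | Is | Less than | Greater than | One of | Contains |
|---|---|---|---|---|---|
| String | Yes | Yes | Yes | Yes | Yes |
| Int | Yes | Yes | Yes | Yes | No |
| Time | Yes | Yes | Yes | No | No |
| Bool | Yes | No | No | No | No |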
The possible conditions could be modeled as a type in Elm this way:
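A minimal sketch of such a type, assuming one variant per column of the chart and borrowing the variant names used by the constructor functions later in this post:

```elm
-- Sketch only: `value` is the data type being filtered on.
type Condition value
    = Is value
    | IsLessThan value
    | IsGreaterThan value
    | IsOneOf (List value)
    | Contains value
```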
Having defined our Condition type, we might choose to only expose type aliases for the data types supported by the filter, so that these are the only options available to client code:
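As a sketch, the aliases might look like the following. The alias names match the ones used later in this post. The one-parameter Condition type is an assumed shape, included so the snippet stands alone, and Time.Posix assumes the elm/time package is installed.

```elm
import Time


-- Assumed shape of the Condition type at this stage of the design.
type Condition value
    = Is value
    | IsLessThan value
    | IsGreaterThan value
    | IsOneOf (List value)
    | Contains value


-- One alias per data type the filter supports.
type alias StringCondition =
    Condition String


type alias IntCondition =
    Condition Int


type alias TimeCondition =
    Condition Time.Posix


type alias BoolCondition =
    Condition Bool
```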
And in order to allow us to change the implementation of our Condition type without changing the interface to it, we might expose functions to construct Conditions rather than exposing the Condition type's variants themselves:
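A sketch of what those functions could look like, assuming a one-parameter Condition type and using the function names that appear later in this post:

```elm
-- Assumed shape of the Condition type at this stage of the design.
type Condition value
    = Is value
    | IsLessThan value
    | IsGreaterThan value
    | IsOneOf (List value)
    | Contains value


-- Each function simply wraps its argument in the matching variant.
is : value -> Condition value
is theValue =
    Is theValue


isLessThan : value -> Condition value
isLessThan theValue =
    IsLessThan theValue


isGreaterThan : value -> Condition value
isGreaterThan theValue =
    IsGreaterThan theValue


isOneOf : List value -> Condition value
isOneOf theValues =
    IsOneOf theValues


contains : value -> Condition value
contains theValue =
    Contains theValue
```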
This is a good start to an API for allowing us to construct Conditions. But there's another constraint we haven't accounted for: not all Conditions should be combined with all the supported data types:
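For instance, assuming a one-parameter Condition type with unrestricted constructor functions, both of the definitions at the bottom of this sketch compile even though the filter has no such conditions. The names invalidBool and invalidInt are made up for illustration.

```elm
-- Assumed shape of the Condition type at this stage of the design.
type Condition value
    = Is value
    | IsLessThan value
    | IsGreaterThan value
    | IsOneOf (List value)
    | Contains value


isLessThan : value -> Condition value
isLessThan theValue =
    IsLessThan theValue


contains : value -> Condition value
contains theValue =
    Contains theValue


-- Compiles, but the chart has no "less than" for booleans.
invalidBool : Condition Bool
invalidBool =
    isLessThan True


-- Compiles, but the chart has no "contains" for integers.
invalidInt : Condition Int
invalidInt =
    contains 5
```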
These mismatched Conditions aren't just illogical; they're also invalid. The chart I showed earlier permits only certain subsets of the possible variants for each type. The remaining combinations can't be included in requests to the filter API.
Our problem, then, is that it's possible to instantiate invalid Conditions using our exposed constructor functions. How might we make these invalid states impossible to represent at the type level?
One possible solution: type-specific constructor functions
One obvious solution is to simply write separate constructor functions for each data type (StringConditions, IntConditions, and so on), covering only the variants each type is compatible with. This would look something like this:
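A partial sketch of that approach, covering only Bool and Int. The function names (boolIs, intIsLessThan, and so on) are hypothetical, and the String and Time functions would follow the same pattern.

```elm
-- Assumed shape of the Condition type at this stage of the design.
type Condition value
    = Is value
    | IsLessThan value
    | IsGreaterThan value
    | IsOneOf (List value)
    | Contains value


-- Bool supports only "is".
boolIs : Bool -> Condition Bool
boolIs theValue =
    Is theValue


-- Int supports everything except "contains".
intIs : Int -> Condition Int
intIs theValue =
    Is theValue


intIsLessThan : Int -> Condition Int
intIsLessThan theValue =
    IsLessThan theValue


intIsGreaterThan : Int -> Condition Int
intIsGreaterThan theValue =
    IsGreaterThan theValue


intIsOneOf : List Int -> Condition Int
intIsOneOf theValues =
    IsOneOf theValues
```

To keep clients from bypassing these functions, the variants of Condition would have to stay unexposed.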
This does enforce the constraint on the allowed combinations of conditions and data types. But the downside is that there are now 13 functions to write instead of 5, functions that will need to be remembered and made sense of in client code. And that's just in our simplified example; in the actual specification for the filter I was working with, there would be 44.
Introducing phantom types
Another solution to this problem is to use so-called "phantom" types. A phantom type gets its name from its declaration of a type parameter that, counterintuitively, is not used inside its definition:
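A minimal illustration, using a hypothetical Distance type that has nothing to do with the filter API:

```elm
-- `unit` is declared on the left of `=` but never used on the right,
-- which is what makes Distance a phantom type.
type Distance unit
    = Distance Float
```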
Usually, type parameters are used to allow a type to be instantiated in terms of another type. List's definition as List a allows us to use the List API to work with Lists of any type of data.
The type parameters in a phantom type serve a different purpose: they allow us to write functions that will only accept that type with certain parameters. Here's an example use case that can be improved by this technique:
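A sketch of the starting model, reconstructed from the names used in the surrounding text (Door, LockState, Locked, Unlocked):

```elm
-- The lock state is stored as ordinary data inside the Door value.
type LockState
    = Locked
    | Unlocked


type Door
    = Door LockState
```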
This model of a Door can have a LockState of either Locked or Unlocked. We'd like to write a function openDoor that maintains the constraint that only an Unlocked door will return the Room it leads to.
With the current model of Door, this is the best we can do:
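A sketch of that function, assuming the lock state is stored inside the Door value. The single-variant Room type is a placeholder invented for this example.

```elm
type LockState
    = Locked
    | Unlocked


type Door
    = Door LockState


-- Placeholder: a real Room would carry some data.
type Room
    = Room


-- The lock state is only known at runtime, so the result has to be
-- wrapped in Maybe to account for locked doors.
openDoor : Door -> Maybe Room
openDoor (Door lockState) =
    case lockState of
        Unlocked ->
            Just Room

        Locked ->
            Nothing
```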
We've modeled the possible states of a door correctly, but since Door Locked and Door Unlocked are both valid instances of Door, we cannot guarantee a function openDoor can return a Room instead of a Maybe Room.
Let's change the LockState in the definition of Door into a type parameter, to make Door a phantom type:
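A sketch of the revised declaration, inferred from the way Door values are constructed in the compiler-error example later in this section:

```elm
-- `lockState` is now a type parameter. It does not appear on the
-- right-hand side, so a Door value carries no lock data at runtime.
type Door lockState
    = Door
```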
Let's also make Locked and Unlocked into separate types with one variant each:

    type Locked = Locked

    type Unlocked = Unlocked
Now we can write a function openDoor that will only accept a Door Unlocked as an argument, and will return a Room:
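A sketch of the restricted function. The Room type is again a placeholder, and unlockedDoor and room are names made up to show a call that type-checks.

```elm
type Door lockState
    = Door


type Locked
    = Locked


type Unlocked
    = Unlocked


-- Placeholder: a real Room would carry some data.
type Room
    = Room


-- Only a Door tagged as Unlocked is accepted, so no Maybe is needed.
openDoor : Door Unlocked -> Room
openDoor _ =
    Room


unlockedDoor : Door Unlocked
unlockedDoor =
    Door


room : Room
room =
    openDoor unlockedDoor
```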
If we instantiated a Door Locked, and attempted to pass it to openDoor, the compiler would produce this error:

    lockedDoor : Door Locked
    lockedDoor =
        Door

    thisWontWork : Room
    thisWontWork =
        openDoor lockedDoor

    ----

    This `lockedDoor` value is a:

        Door Locked

    But `openDoor` needs the 1st argument to be:

        Door Unlocked
Before our changes, Door Locked was a possible value of the Door type. After our changes, Door Locked is now a type itself, one that can be used in a function signature.
Let's apply this technique to our Condition problem. My solution was to categorize the variants into groups based on their compatibility with different types:
| (all types) | comparison | substring | multiple |
|---|---|---|---|
| Is | Is less than, is greater than | Contains | Is one of |
The compatibility of these groups with a particular type of Condition can be represented by the type parameters comparison, substring, and multiple. Adding these parameters to the declaration makes Condition a phantom type:
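A sketch of the revised declaration, with the parameter order taken from the constructor function signatures shown later in this post:

```elm
-- `comparison`, `substring` and `multiple` never appear on the
-- right-hand side. They exist only to be constrained in signatures.
type Condition value comparison substring multiple
    = Is value
    | IsLessThan value
    | IsGreaterThan value
    | IsOneOf (List value)
    | Contains value
```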
Now, we can define single-value types to use in the definition of Condition type aliases and constructor functions:

    type SubStringAllowed = SubStringAllowed

    type SubStringNotAllowed = SubStringNotAllowed

    type ComparisonAllowed = ComparisonAllowed

    type ComparisonNotAllowed = ComparisonNotAllowed

    type MultipleValueAllowed = MultipleValueAllowed

    type MultipleValueNotAllowed = MultipleValueNotAllowed
Now, let's define each type alias in terms of which groups of variants its data type is compatible with. These read more or less like a plain English description of which variants are allowed:

    type alias StringCondition =
        Condition String ComparisonAllowed SubStringAllowed MultipleValueAllowed

    type alias IntCondition =
        Condition Int ComparisonAllowed SubStringNotAllowed MultipleValueAllowed

    type alias TimeCondition =
        Condition Time.Posix ComparisonAllowed SubStringNotAllowed MultipleValueNotAllowed

    type alias BoolCondition =
        Condition Bool ComparisonNotAllowed SubStringNotAllowed MultipleValueNotAllowed
And now let's fix the type signatures of our constructor functions, using concrete types for the values of the type parameters we want to restrict:
    is : value -> Condition value comparison substring multiple
    is theValue =
        Is theValue

    isLessThan : value -> Condition value ComparisonAllowed substring multiple
    isLessThan theValue =
        IsLessThan theValue

    isGreaterThan : value -> Condition value ComparisonAllowed substring multiple
    isGreaterThan theValue =
        IsGreaterThan theValue

    contains : value -> Condition value comparison SubStringAllowed multiple
    contains theValue =
        Contains theValue

    isOneOf : List value -> Condition value comparison substring MultipleValueAllowed
    isOneOf theValues =
        IsOneOf theValues
Now these invalid Conditions will produce type errors:
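A self-contained sketch of the effect, trimmed to two variants and two aliases. The names valid, invalidBool, and invalidInt are made up for illustration, and the rejected definitions are left commented out so the snippet itself compiles.

```elm
type Condition value comparison substring multiple
    = IsLessThan value
    | Contains value


type ComparisonAllowed
    = ComparisonAllowed


type ComparisonNotAllowed
    = ComparisonNotAllowed


type SubStringAllowed
    = SubStringAllowed


type SubStringNotAllowed
    = SubStringNotAllowed


type MultipleValueAllowed
    = MultipleValueAllowed


type MultipleValueNotAllowed
    = MultipleValueNotAllowed


type alias IntCondition =
    Condition Int ComparisonAllowed SubStringNotAllowed MultipleValueAllowed


type alias BoolCondition =
    Condition Bool ComparisonNotAllowed SubStringNotAllowed MultipleValueNotAllowed


isLessThan : value -> Condition value ComparisonAllowed substring multiple
isLessThan theValue =
    IsLessThan theValue


contains : value -> Condition value comparison SubStringAllowed multiple
contains theValue =
    Contains theValue


-- Accepted: IntCondition has ComparisonAllowed in the comparison slot.
valid : IntCondition
valid =
    isLessThan 10


-- Rejected if uncommented: isLessThan yields ComparisonAllowed, but
-- BoolCondition requires ComparisonNotAllowed.
-- invalidBool : BoolCondition
-- invalidBool =
--     isLessThan True


-- Rejected if uncommented: contains yields SubStringAllowed, but
-- IntCondition requires SubStringNotAllowed.
-- invalidInt : IntCondition
-- invalidInt =
--     contains 5
```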
Things to consider
Keep an eye on your module interface
The guarantees that phantom types offer are not secure unless you pay attention to which parts of your modules you expose.
In this example, we exposed:
- The type aliases
- The constructor functions (is, isLessThan, and so on)
And we did not expose:
- The single-value types (SubStringAllowed, ComparisonNotAllowed, and so on)
- The variants of the Condition type
This prevented client code from instantiating invalid Conditions.
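A trimmed sketch of such a module interface, assuming the code lives in a file named Condition.elm inside an application. The module name and the choice of which aliases and functions to show are assumptions. An Elm application compiles with exposed signatures that mention unexposed types, although a published package would need its documentation to account for them.

```elm
module Condition exposing
    ( BoolCondition
    , StringCondition
    , contains
    , is
    , isLessThan
    )

-- Not exposed: client code cannot use these variants directly.


type Condition value comparison substring multiple
    = Is value
    | IsLessThan value
    | Contains value



-- Not exposed: client code cannot name these marker types in its own
-- annotations.


type ComparisonAllowed
    = ComparisonAllowed


type ComparisonNotAllowed
    = ComparisonNotAllowed


type SubStringAllowed
    = SubStringAllowed


type SubStringNotAllowed
    = SubStringNotAllowed


type MultipleValueAllowed
    = MultipleValueAllowed


type MultipleValueNotAllowed
    = MultipleValueNotAllowed



-- Exposed: the aliases and the constructor functions.


type alias StringCondition =
    Condition String ComparisonAllowed SubStringAllowed MultipleValueAllowed


type alias BoolCondition =
    Condition Bool ComparisonNotAllowed SubStringNotAllowed MultipleValueNotAllowed


is : value -> Condition value comparison substring multiple
is theValue =
    Is theValue


isLessThan : value -> Condition value ComparisonAllowed substring multiple
isLessThan theValue =
    IsLessThan theValue


contains : value -> Condition value comparison SubStringAllowed multiple
contains theValue =
    Contains theValue
```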
Consider developer experience
Although their power to restrict instances of a type can be helpful, phantom types can potentially produce compiler errors that are hard to understand. This is a risk, so consider whether a developer will be able to make sense of them without an understanding of the internals of your module or reading notes in your documentation.
In my case, the constructor functions I exposed included documentation comments describing the constraints and the purpose of the type parameters that a developer would see in error messages.
Phantom types can be helpful when certain instances of a type need to be restricted from use with certain functions, but duplicating the type and its helper functions would be impractical. They're not a tool to reach for regularly, but they can be valuable to keep in mind when other options for maintaining constraints fall short.