Polymorphic union refinement
Untagged union types are types of the form that contain values that are either of type or of type , with no "tag" information to denote which is which:
Operationally, there is no obvious way to use an untagged union type:
unlike a disjoint union type, which contains an additional tag bit,
there is no general-purpose match
expression that can extract either
the or the .
For languages with runtime type introspection, some systems of refinement types enable consumption of union types. If is found via a runtime type test to not be of type , then it may be used at type . For instance, the following programs are valid Flow and also valid TypeScript:
function numberwang(x: number | string): string {
if (typeof x === "number") {
// in this block, `x: number`
return `half of ${x * 2}`;
} else {
// in this block, `x: string`
return `a string with ${x.length} characters`;
}
}
class Dog {
name: string;
constructor(name: string) { this.name = name; }
}
class Car {
wheels: number;
constructor(wheels: number) { this.wheels = wheels; }
}
function description(x: Dog | Car): string {
if (x instanceof Dog) {
// in this block, `x: Dog`
return `the dog ${x.name}`;
} else {
// in this block, `x: Car`
return `a car with ${x.wheels} wheels`;
}
}
However, runtime type tests are not always so simple. The following code
also typechecks in both Flow and TypeScript, yet does not necessarily
return an Array<T>
1:
function orSingleton<T>(x: T | Array<T>): Array<T> {
if (x instanceof Array) {
return x;
}
return [x];
}
While x instanceof Array
does imply that x: Array<E>
for some E
,
it does not imply specifically that x: Array<T>
, even when it is known
that x: T | Array<T>
. The problem is that T
could itself be
Array<U>
for some type U
, so x: Array<U> | Array<Array<U>>
is an
instance of Array
in either case:
// Purports to be the identity function, but fails when `x` happens to
// be an array.
function id<T>(x: T): T {
const singletonArray: T[] = orSingleton(x);
return singletonArray[0];
}
const digits: number[] = id([3, 1, 4]);
// unsound: `digits` is actually `3`
digits.includes(4); // TypeError: digits.includes is not a function
In some sense, the core issue here is the refinement itself. A runtime
check that x instanceof Array
says nothing about the element type of
the array, just as a runtime check that typeof x === "function"
says
nothing about the domain or codomain. The untagged union type offers
a way to exploit this flaw. The same flaw occurred in Scala2:
// Counterexample by Lionel Parreaux
def test[A]: List[Int] | A => Int =
{ case ls: List[Int] => ls.head case _ => 0 }
test(List("oops")) // crashes
By contrast, Typed Racket also supports refinements of union types, but behaves correctly here:
#lang typed/racket
(: valid-refinement (All (T) (U Number (Boxof T)) (-> Number T) -> T))
(define (valid-refinement x from-number)
(if (box? x)
(unbox x) ;; works: must be a `Boxof T` (good)
(from-number x)))
(: invalid-refinement (All (T) (U T (Boxof T)) -> T))
(define (invalid-refinement x)
(if (box? x)
(unbox x) ;; type error here: could be a box of something else (good!)
x))
In Flow and TypeScript, this issue is resolved by using tagged (disjoint) unions instead of untagged unions:
type ElementOrArray<T> =
{type: "ELEMENT", value: T} | {type: "ARRAY", array: T[]};
function orSingleton<T>(x: ElementOrArray<T>): T[] {
if (x.type === "ARRAY") {
return x.array;
}
return [x.value];
}
// Actually the identity function.
function id<T>(x: T): T {
const singletonArray: T[] = orSingleton({type: "ELEMENT", value: x});
return singletonArray[0];
}
Since each variant of ElementOrArray<T>
includes a distinct type
tag, there is no ambiguity in orSingleton
, and the required changes to
id
fix the bug.
In JavaScript, all values are in fact tagged as one of "string"
,
"number"
, "object"
, or a few other options, and the typeof
operator exposes this tag at runtime. Thus, "untagged" unions like
number | null
or boolean | string
, where every variant has a
distinct value tag, are actually tagged unions, and may still be used
safely.
Refinement with instanceof
and generic unions is unsound, Flow issue #6741 (2018)