Polymorphic union refinement
Untagged union types are types of the form that contain values that are either of type or of type , with no "tag" information to denote which is which:
Operationally, there is no obvious way to use an untagged union type:
unlike a disjoint union type, which contains an additional tag bit,
there is no general-purpose match expression that can extract either
the or the .
For languages with runtime type introspection, some systems of refinement types enable consumption of union types. If is found via a runtime type test to not be of type , then it may be used at type . For instance, the following programs are valid Flow and also valid TypeScript:
function numberwang(x: number | string): string {
if (typeof x === "number") {
// in this block, `x: number`
return `half of ${x * 2}`;
} else {
// in this block, `x: string`
return `a string with ${x.length} characters`;
}
}
class Dog {
name: string;
constructor(name: string) { this.name = name; }
}
class Car {
wheels: number;
constructor(wheels: number) { this.wheels = wheels; }
}
function description(x: Dog | Car): string {
if (x instanceof Dog) {
// in this block, `x: Dog`
return `the dog ${x.name}`;
} else {
// in this block, `x: Car`
return `a car with ${x.wheels} wheels`;
}
}
However, runtime type tests are not always so simple. The following code
also typechecks in both Flow and TypeScript, yet does not necessarily
return an Array<T>1:
function orSingleton<T>(x: T | Array<T>): Array<T> {
if (x instanceof Array) {
return x;
}
return [x];
}
While x instanceof Array does imply that x: Array<E> for some E,
it does not imply specifically that x: Array<T>, even when it is known
that x: T | Array<T>. The problem is that T could itself be
Array<U> for some type U, so x: Array<U> | Array<Array<U>> is an
instance of Array in either case:
// Purports to be the identity function, but fails when `x` happens to
// be an array.
function id<T>(x: T): T {
const singletonArray: T[] = orSingleton(x);
return singletonArray[0];
}
const digits: number[] = id([3, 1, 4]);
// unsound: `digits` is actually `3`
digits.includes(4); // TypeError: digits.includes is not a function
In some sense, the core issue here is the refinement itself. A runtime
check that x instanceof Array says nothing about the element type of
the array, just as a runtime check that typeof x === "function" says
nothing about the domain or codomain. The untagged union type offers
a way to exploit this flaw. The same flaw occurred in Scala2:
// Counterexample by Lionel Parreaux
def test[A]: List[Int] | A => Int =
{ case ls: List[Int] => ls.head case _ => 0 }
test(List("oops")) // crashes
By contrast, Typed Racket also supports refinements of union types, but behaves correctly here:
#lang typed/racket
(: valid-refinement (All (T) (U Number (Boxof T)) (-> Number T) -> T))
(define (valid-refinement x from-number)
(if (box? x)
(unbox x) ;; works: must be a `Boxof T` (good)
(from-number x)))
(: invalid-refinement (All (T) (U T (Boxof T)) -> T))
(define (invalid-refinement x)
(if (box? x)
(unbox x) ;; type error here: could be a box of something else (good!)
x))
In Flow and TypeScript, this issue is resolved by using tagged (disjoint) unions instead of untagged unions:
type ElementOrArray<T> =
{type: "ELEMENT", value: T} | {type: "ARRAY", array: T[]};
function orSingleton<T>(x: ElementOrArray<T>): T[] {
if (x.type === "ARRAY") {
return x.array;
}
return [x.value];
}
// Actually the identity function.
function id<T>(x: T): T {
const singletonArray: T[] = orSingleton({type: "ELEMENT", value: x});
return singletonArray[0];
}
Since each variant of ElementOrArray<T> includes a distinct type
tag, there is no ambiguity in orSingleton, and the required changes to
id fix the bug.
In JavaScript, all values are in fact tagged as one of "string",
"number", "object", or a few other options, and the typeof
operator exposes this tag at runtime. Thus, "untagged" unions like
number | null or boolean | string, where every variant has a
distinct value tag, are actually tagged unions, and may still be used
safely.
Refinement with instanceof and generic unions is unsound, Flow issue #6741 (2018)