Pattern matching syntax and semantics
Table of contents
- Abstract
- Problem
- Background
- Proposal
- Details
- Rationale
- Alternatives considered
- Future work
- Examples
Abstract
This paper proposes concrete syntax and semantic choices for Carbon patterns.
Problem
Carbon uses patterns wherever a value should be given a name, decomposed, or matched against: in function parameters, variable declarations, match
statement case
s, for
statement loop variables, and so on. Simplified forms of patterns, required to be just a simple name binding, appear in additional contexts, such as fields in classes and implicit parameter lists. While we have syntax specified for some of these constructs, we do not have an approved proposal describing the syntax or semantics of patterns.
Background
See pattern matching on wikipedia for a broad overview of the subject.
We refer to the value being matched by a pattern as the scrutinee.
Proposal
Patterns in Carbon are a generalization of the expression grammar. Compared to expressions, patterns add:
- Bindings, of the form
name: type
, which give a name for the scrutinee. var
pattern, which creates a separate object to hold the value of the scrutinee, and causes any nested bindings to be mutable lvalues instead of immutable rvalues.- Additional syntax to make matching against structs more convenient.
Details
Expressions versus proper patterns
Expressions are patterns, as described below. A pattern that is not an expression, because it contains pattern-specific syntax such as a binding, is a proper pattern. Many expression forms, such as arbitrary function calls, are not permitted as proper patterns, so cannot contain bindings.
- pattern ::= proper-pattern
fn F(n: i32) -> i32 { return n; }
match (F(42)) {
// ❌ Error: binding can't appear in a function call.
case (F(n: i32)) => {}
}
Expression patterns
An expression is a pattern.
- pattern ::= expression
The pattern is compared with the expression using the ==
operator: pattern ==
scrutinee.
fn F(n: i32) {
match (n) {
// ✅ Results in an `n == 5` comparison.
// OK despite `n` and `5` having different types.
case 5 => {}
}
}
Any ==
operations performed by a pattern match occur in lexical order, but for repeated matches against the same pattern, later comparisons may be skipped by reusing the result from an earlier comparison:
class ChattyIntMatcher {
external impl as EqWith(i32) {
fn Eq[me: ChattyIntMatcher](other: i32) {
Print("Matching {0}", other);
return other == 1;
}
}
}
fn F() {
// Prints `Matching 1` then `Matching 2`,
// may or may not then print `Matching 1` again.
match ((1, 2)) {
case ({} as ChattyIntMatcher, 0) => {}
case (1, {} as ChattyIntMatcher) => {}
case ({} as ChattyIntMatcher, 2) => {}
}
}
Bindings
Name bindings
A name binding is a pattern.
- binding-pattern ::=
unused
? identifier:
expression - proper-pattern ::= binding-pattern
The type of the identifier is specified by the expression. The scrutinee is implicitly converted to that type if necessary.
fn F() -> i32 {
match (5) (
// ✅ `5` is implicitly converted to `i32`.
// Returns `5 as i32`.
case n: i32 => { return n; }
}
}
When a new object needs to be created for the binding, the lifetime of the bound value matches the scope of the binding.
class NoisyDestructor {
fn Make() -> Self { return {}; }
external impl i32 as ImplicitAs(NoisyDestructor) {
fn Convert[me: i32]() -> Self { return Make(); }
}
destructor {
Print("Destroyed!");
}
}
fn G() {
// Does not print "Destroyed!".
let n: NoisyDestructor = NoisyDestructor.Make();
Print("Body of G");
// Prints "Destroyed!" here.
}
fn H(n: i32) {
// Does not print "Destroyed!".
let (v: NoisyDestructor, w: i32) = (n, n);
Print("Body of H");
// Prints "Destroyed!" here.
}
As specified in #2022, the unused
keyword indicates that the binding is intended to not be used.
Wildcard
A syntax like a binding but with _
in place of an identifier can be used to ignore part of a value.
- binding-pattern ::=
_
:
expression
See #2022 for details.
The behavior is similar to that of an unused
binding with a unique name.
fn F(n: i32) {
match (n) {
// ✅ Matches and discards the value of `n`.
case _: i32 => {}
// ❌ Error: unreachable.
default => {}
}
}
As specified in #1084, function redeclarations may replace named bindings with wildcards but may not use different names.
fn G(n: i32);
fn H(n: i32);
fn J(n: i32);
// ✅ Does not use `n`.
fn G(_: i32) {}
// ❌ Error: name of parameter does not match declaration.
fn H(m: i32) {}
// ✅ Does not use `n`.
fn J(unused n: i32);
Generic bindings
A :!
can be used in place of :
for a binding that is usable at compile time.
- generic-pattern ::=
unused
?template
? identifier:!
expression - generic-pattern ::=
template
?_
:!
expression - proper-pattern ::= generic-pattern
// ✅ `F` takes a generic type parameter `T` and a parameter `x` of type `T`.
fn F(T:! Type, x: T) {
var v: T = x;
}
The template
keyword indicates the binding is introducing a template parameter, so name lookups into the parameter should be deferred until its value is known.
auto
and type deduction
The auto
keyword is a placeholder for a unique deduced type.
- expression ::=
auto
fn F(n: i32) {
var v: auto = SomeComplicatedExpression(n);
// Equivalent to:
var w: T = SomeComplicatedExpression(n);
// ... where `T` is the type of the initializer.
}
The auto
keyword is only permitted in specific contexts. Currently these are:
- As the return type of a function.
- As the type of a binding.
It is anticipated that auto
may be permitted in more contexts in the future, for example as a generic argument in a parameterized type that appears in a context where auto
is allowed, such as Vector(auto)
or auto*
.
When the type of a binding requires type deduction, the type is deduced against the type of the scrutinee and deduced values are substituted back into the type before pattern matching is performed.
fn G[T:! Type](p: T*);
class X { external impl as ImplicitAs(i32*); }
// ✅ Deduces `T = i32` then implicitly and
// trivially converts `p` to `i32*`.
fn H1(p: i32*) { G(p); }
// ❌ Error, can't deduce `T*` from `X`.
fn H2(p: X) { G(p); }
The above is only an illustration; the behavior of type deduction is not specified in this proposal.
var
A var
prefix indicates that a pattern provides mutable storage for the scrutinee.
- proper-pattern ::=
var
proper-pattern
A var
pattern matches when its nested pattern matches. The type of the storage is the resolved type of the nested pattern. Any bindings within the nested pattern refer to portions of the corresponding storage rather than to the scrutinee.
fn F(p: i32*);
fn G() {
match ((1, 2)) {
// `n` is a mutable `i32`.
case (var n: i32, 1) => { F(&n); }
// `n` and `m` are the elements of a mutable `(i32, i32)`.
case var (n: i32, m: i32) => { F(if n then &n else &m); }
}
}
Pattern matching precedes the initialization of the storage for any var
patterns. An introduced variable is only initialized if the complete pattern matches.
class X {
destructor { Print("Destroyed!"); }
}
fn F(x: X) {
match ((x, 1 as i32)) {
case (var y: X, 0) => {}
case (var z: X, 1) => {}
// Prints "Destroyed!" only once, when `z` is destroyed.
}
}
A var
pattern cannot be nested within another var
pattern. The declaration syntax var
pattern =
expresson ;
is equivalent to let
var
pattern =
expression ;
.
Tuple patterns
A tuple of patterns can be used as a pattern.
- tuple-pattern ::=
(
[expression,
]* proper-pattern [,
pattern]*,
?)
- proper-pattern ::= tuple-pattern
A tuple-pattern containing no commas is treated as grouping parens: the contained proper-pattern is matched directly against the scrutinee. Otherwise, the behavior is as follows.
A tuple pattern is matched left-to-right. The scrutinee is required to be of tuple type.
Note that a tuple pattern must contain at least one proper-pattern. Otherwise, it is a tuple-valued expression. However, a tuple pattern and a corresponding tuple-valued expression are matched in the same way because ==
for a tuple compares fields left-to-right.
Struct patterns
A struct can be matched with a struct pattern.
- proper-pattern ::=
{
[field-init,
]* proper-field-pattern [,
field-pattern]*}
- proper-pattern ::=
{
[field-pattern,
]+_
}
- field-init ::= designator
=
expression - proper-field-pattern ::= designator
=
proper-pattern - proper-field-pattern ::= binding-pattern
- field-pattern ::= field-init
- field-pattern ::= proper-field-pattern
A struct pattern resembles a struct literal, with at least one field initialized with a proper pattern:
match ({.a = 1, .b = 2}) {
// Struct literal as an expression pattern.
case {.b = 2, .a = 1} => {}
// Struct pattern.
case {.b = n: i32, .a = m: i32} => {}
}
The scrutinee is required to be of struct type, and to have the same set of field names as the pattern. The pattern is matched left-to-right, meaning that matching is performed in the field order specified in the pattern, not in the field order of the scrutinee. This is consistent with the behavior of matching against a struct-valued expression, where the expression pattern becomes the left operand of the ==
and so determines the order in which ==
comparisons for fields are performed.
In the case where a field will be bound to an identifier with the same name, a shorthand syntax is available: a: T
is synonymous with .a = a: T
.
match ({.a = 1, .b = 2}) {
case {a: i32, b: i32} => { return a + b; }
}
If some fields should be ignored when matching, a trailing , _
can be added to specify this:
match ({.a = 1, .b = 2}) {
case {.a = 1, _} => { return 1; }
case {b: i32, _} => { return b; }
}
This is valid even if all fields are actually named in the pattern.
Alternative patterns
An alternative pattern is used to match one alternative of a choice type.
- proper-pattern ::= callee-expression tuple-pattern
- proper-pattern ::= designator tuple-pattern?
Here, callee-expression is syntactically an expression that is valid as the callee in a function call expression, and an alternative pattern is syntactically a function call expression whose argument list contains at least one proper-pattern.
If a callee-expression is provided, it is required to name a choice type alternative that has a parameter list, and the scrutinee is implicitly converted to that choice type. Otherwise, the scrutinee is required to be of some choice type, and the designator is looked up in that type and is required to name an alternative with a parameter list if and only if a tuple-pattern is specified.
The pattern matches if the active alternative in the scrutinee is the specified alternative, and the arguments of the alternative match the given tuple pattern (if any).
choice Optional(T:! Type) {
None,
Some(T)
}
match (Optional(i32).None) {
// ✅ `.None` resolved to `Optional(i32).None`.
case .None => {}
// ✅ `.Some` resolved to `Optional(i32).Some`.
case .Some(n: i32) => { Print("{0}", n); }
// ❌ Error, no such alternative exists.
case .Other => {}
}
class X {
external impl as ImplicitAs(Optional(i32));
}
match ({} as X) {
// ✅ OK, but expression pattern.
case Optional(i32).None => {}
// ✅ OK, implicitly converts to `Optional(i32)`.
case Optional(i32).Some(n: i32) => { Print("{0}", n); }
}
Note that a pattern of the form Optional(T).None
is an expression pattern and is compared using ==
.
Templates
Any checking of the type of the scrutinee against the type of the pattern that cannot be performed because the type of the scrutinee involves a template parameter is deferred until the template parameter’s value is known. During instantiation, patterns that are not meaningful due to a type error are instead treated as not matching. This includes cases where an ==
fails because of a missing EqWith
implementation.
fn TypeName[template T:! Type](x: T) -> String {
match (x) {
// ✅ OK, the type of `x` is a template parameter.
case _: i32 => { return "int"; }
case _: bool => { return "bool"; }
case _: auto* => { return "pointer"; }
default => { return "unknown"; }
}
}
Cases where the match is invalid for reasons not involving the template parameter are rejected when type-checking the template:
fn MeaninglessMatch[template T:! Type](x: T*) {
match (*x) {
// ✅ OK, `T` could be a tuple.
case (_: auto, _: auto) => {}
default => {}
}
match (x->y) {
// ✅ OK, `T.y` could be a tuple.
case (_: auto, _: auto) => {}
default => {}
}
match (x) {
// ❌ Error, tuple pattern cannot match value of non-tuple type `T*`.
case (_: auto, _: auto) => {}
default => {}
}
}
Guards
We allow case
s within a match
statement to have guards. These are not part of pattern syntax, but instead are specific to case
syntax:
- case ::=
case
pattern [if
expression]?=>
block
A guard indicates that a case
only matches if some predicate holds. The bindings in the pattern are in scope in the guard:
match (x) {
case (m: i32, n: i32) if m + n < 5 => { return m - n; }
}
For consistency, this facility is also available for default
clauses, so that default
remains equivalent to case _: auto
.
Refutability, overlap, usefulness, and exhaustiveness
Some definitions:
- A pattern P is useful in the context of a set of patterns C if there exists a value that P can match that no pattern in C matches.
- A set of patterns C is exhaustive if it matches all possible values. Equivalently, C is exhaustive if the pattern
_: auto
is not useful in the context of C. - A pattern P is refutable if there are values that it does not match, that is, if the pattern
_: auto
is useful in the context of {P}. Equivalently, the pattern P is refutable if the set of patterns {P} is not exhaustive. - A set of patterns C is overlapping if there exists any value that is matched by more than one pattern in C.
For the purpose of these terms, expression patterns that match a constant tuple, struct, or choice value are treated as if they were tuple, struct, or alternative patterns, respectively, and bool
is treated like a choice type. Any expression patterns that remain after applying this rule are considered to match a single value from an infinite set of values so that a set of expression patterns is never exhaustive:
fn IsEven(n: u8) -> bool {
// Not considered exhaustive.
match (n) {
case 0 => { return true; }
case 1 => { return false; }
...
case 255 => { return false; }
}
// Code here is considered to be reachable.
}
fn IsTrue(b: bool) -> bool {
// Considered exhaustive.
match (b) {
case false => { return false; }
case true => { return true; }
}
// Code here is considered to be unreachable.
}
When determining whether a pattern is useful, no attempt is made to determine the value of any guards, and instead a worst-case assumption is made: a guard on that pattern is assumed to evaluate to true and a guard on any pattern in the context set is assumed to evaluate to false.
We will diagnose the following situations:
-
A pattern is not useful in the context of prior patterns. In a
match
statement, this happens if a pattern ordefault
cannot match because all cases it could cover are handled by prior cases or a priordefault
. For example:choice Optional(T:! Type) { None, Some(T) } fn F(a: Optional(i32), b: Optional(i32)) { match ((a, b)) { case (.Some(a: i32), _: auto) => {} // ✅ OK, but only matches values of the form `(None, Some)`, // because `(Some, Some)` is matched by the previous pattern. case (_: auto, .Some(b: i32)) => {} // ✅ OK, matches all remaining values. case (.None, .None) => {} // ❌ Error, this pattern never matches. case (_: auto, _: auto) => {} } }
-
A pattern match is not exhaustive and the program doesn’t explicitly say what to do when no pattern matches. For example:
-
If the patterns in a
match
are not exhaustive and nodefault
is provided.fn F(n: i32) -> i32 { // ❌ Error, this `match` is not exhaustive. match (n) { case 0 => { return 2; } case 1 => { return 3; } case 2 => { return 5; } case 3 => { return 7; } case 4 => { return 11; } } }
-
If a refutable pattern appears in a context where only one pattern can be specified, such as a
let
orvar
declaration, and there is no fallback behavior. This currently includes all pattern matching contexts other thanmatch
statements, but thevar
/let
-else
feature in #1871 would introduce a second context permitting refutable matches, and overloaded functions might introduce a third context.fn F(n: i32) { // ❌ Error, refutable expression pattern `5` used in context // requiring an irrefutable pattern. var 5 = n; } // ❌ Error, refutable expression pattern `5` used in context // requiring an irrefutable pattern. fn G(n: i32, 5);
-
-
When a set of patterns have no ordering or tie-breaker, it is an error for them to overlap unless there is a unique best match for any value that matches more than one pattern. However, this situation does not apply to any current language rule:
- For
match
statements, patterns are matched top-down, so overlap is permitted. - We do not yet have an approved design for overloaded functions, but it is anticipated that declaration order will be used in that case too.
- For a set of
impl
s that match a givenimpl
lookup, argument deduction is used rather than pattern matching, butimpl
s with the same type structure are an error unless amatch_first
declaration is used to order theimpl
s. (This is a pre-existing rule and is unchanged by this proposal.)
- For
Rationale
- Software and language evolution
- The
, _
syntax for struct patterns enables a style where adding a struct member is not a breaking change. - The requirement that matches be exhaustive makes it easier to add new cases to a choice type, by requiring the compiler to detect places where the new value is not handled.
- The
- Code that is easy to read, understand, and write
- Pattern syntax makes it easier to match complex values.
- Modeling pattern syntax after expressions eases the burden of learning a new sub-language for pattern-matching: patterns are an extension of expressions, and expressions are a special case of patterns.
- Requiring exhaustiveness for matches makes control flow easier to understand as there is never a value for which the
match
skips all cases.
- Interoperability with and migration from existing C++ code
- The rules for matching a templated value can be used to replace
if constexpr
in many cases.
- The rules for matching a templated value can be used to replace
Alternatives considered
Shorthand for auto
We could provide a shorter syntax for name: auto
. Proposal #851 considered the following shorthands and decided against using them:
var n: _ = init;
var n = init;
A novel suggestion that avoids some of the disadvantages of those syntaxes would be to use:
var n:= init;
Advantages:
- Shorter syntax for variables with a deduced type.
- Potentially allows removal of the
auto
keyword.
Disadvantages:
- Appears to introduce a
:=
syntax, but that only arises in cases where an initializer immediately follows the name.- Cases such as
var (a:, b:) = my_pair;
would either be invalid or would not use the:=
syntax. - If we accept such cases, there is a risk of grammar ambiguities.
- If we reject such cases, we may still want to keep
auto
around for them, creating inconsistency.
- Cases such as
- Not a complete replacement for
auto
if we want to also allow things likev: Vector(auto)
. C++ doesn’t allow the equivalent syntax currently, but it was part of the Concepts TS and seems likely to return at some point. - No syntactic difference between accidentally omitting a type entirely and requesting type deduction. However, the mistake of omitting a type but retaining the
:
seems unlikely, and the:
followed by the absence of a type is a signal that something is happening, so this seems to be less of a concern than for thevar n = init;
syntax.
See discussion topics 1 and 2.
Struct pattern syntax
We could omit the , _
syntax. This would simplify struct patterns, but at the cost of removing a feature that can be useful for reducing verbosity and making library evolution easier.
We could always allow a struct pattern to match a struct with more fields, without requiring a , _
suffix. This would aid evolution by reducing the cases where adding a field to a struct can be a breaking change, but such cases would still exist. Further, this would make matches against a struct-valued expression inconsistent with matches against a struct pattern.
We could use a different syntax instead of , _
. Other options that were explicitly considered:
, ...
seems visually evocative of “and more stuff”, but risks conflicting with variadic syntax or at least being confusing when used in a variadic context, given that variadics are expected to claim...
for pack expansion., ._
suggests matching a field without specifying a name, but might create an impression of matching just one field, and three low punctuation characters in a row seems to be pushing the limits of readability.
On balance, , _
harmonizes well with the use of _
to introduce a wildcard, without being too visually confusing given that the other use of _
has a following :
.
We could remove the {field: type}
shorthand and require {.field = field: type}
. This may avoid encouraging reusing the field name even when it is not appropriate, and would remove some syntactic sugar that’s not formally necessary. However, we expect this to be a common case whose ergonomics are important.
We could use a different syntax for {field: type}
that is less divergent from other struct syntaxes:
{.field: type}
seems like a contender, but doesn’t work because that is already recognized as a struct type literal. Also, the use of normal binding syntax means that every locally-introduced name is always introduced byname: type
wherename
is not preceded by.
.{.=field: type}
might be a reasonable mnemonic shorthand for{.field = field: type}
, but looks a little surprising. This is probably the best choice if concerns are found with{field: type}
syntax.
Type pattern matching
We could treat type deduction as a form of pattern matching. For example, we could allow
fn F(a: Vector(T:! Type)) -> T { return a[0]; }
fn G() -> i32 {
let v: Vector(i32) = (1, 2, 3);
// Deduces `T = i32`.
return F(v);
}
where the value of T
is determined by pattern-matching Vector(T:! Type)
against the supplied type Vector(i32)
of v
. And symmetrically:
fn H[m: i32, n: i32](k: i32, (m, n)) { return m + n; }
fn I() {
// Deduces `m = 2`, `n = 3`.
H(1, (2, 3));
}
This would ensure consistency between pattern matching and deduction, potentially reducing the number of rules that Carbon developers need to learn.
We find that use of pattern-matching in type position can harm readability. For example, the first of these two examples may be easier to read due to having less nesting:
fn F[T:! Type](x: T);
fn F(x: (T:! Type));
Having distinct syntax for type-level matching and value-level matching helps guide the reader to the correct interpretation, even though the underlying matching process is expected to be similar or identical. As a result, we keep type deduction syntax and pattern matching syntax separate for now:
- In pattern matching, bindings and wildcards are introduced by nested
:
/:!
patterns, and the right-hand side of a binding pattern is never a proper pattern. - In type deduction, deduced values are specified separately and an expression written in terms of those bindings describes the type.
There are some contexts where there is no syntactic location for introducing deduced values, such as in case
labels. For syntactic consistency, such cases should be addressed by adding forall
syntax, if there is motivation to support deduction:
match (templated_value) {
case forall [template T:! Type] (p: T*) => { heap.Delete(p); }
}
Introducer syntax for expression patterns
We could have some separate introducer syntax to distinguish expression patterns from other kinds of patterns:
match ((a, b)) {
case is (1, 2) => {}
case (is 3, n: i32) => {}
case (m: i32, is 4) => {}
}
This would reduce the chance of confusion in cases where an expression and a similar-looking pattern are treated differently:
class TupleLike {
external impl as (i32, i32);
}
fn MatchTupleLike(t: TupleLike) {
match (t) {
// ✅ OK, expression pattern;
// `t` implicitly converted to `(i32, i32)` by
// built-in `impl EqWith` for tuples.
case (1, 2) => {}
// ❌ Error, `t` is not a tuple.
case (n: i32, 3) => {}
}
}
This would also permit the same, or overlapping, syntax to be used in expressions and patterns with different meanings, for example:
fn F() {
// Here, `[...]` is an array type.
var array: [i32; 5];
match (&array) {
// Here, `[...]` could introduce deduced parameters.
case [T:! Type] x: T* => {}
}
}
Similarly, if we added a &PATT
pattern to match the address of a value, a construct like &n
would be ambiguous without in introducer:
var n: i32 = 5;
fn MatchPointerOrPointee(p: i32*) {
match (p) {
case is &n => {
Print("given pointer to n");
}
// Here, the pattern `&PATT` could match the address of
// a value that matches `PATT`.
case &(is n) => {
Print("given pointer to i32 whose value equals the value of n");
}
case &(m: i32) => {
Print("given pointer to value {0}", m);
}
}
}
Such an introducer would also denote the portions of a pattern that are evaluated at runtime rather than at compile time, benefitting Carbon’s goals of readability and of predictable performance, and would also add syntactic separation between the parts of a pattern that have full exhaustiveness checking and the parts that do not.
However, this would also introduce additional ceremony for the common case where part of a pattern is a specific value. This could be mitigated by permitting certain kinds of value as patterns without an introducer, such as numeric literals and true
and false
, at the cost of introducing more complexity and more confusion over which cases require is
and which do not.
We have also not identified a good choice for the introducer syntax, should we pursue this direction. is
is not an ideal choice, because elsewhere in Carbon syntax, it is a relation between a value and its type, so is T
may be misread as matching values whose type is T
. ==
expression has been suggested, but that would imply the opposite operand order of that in this proposal – scrutinee ==
expression rather than expression ==
scrutinee – which would compare struct fields in a surprising order that diverges from the order of comparison for a struct pattern. A prefix ==
may also be visually surprising.
For now, we do not add such an introducer, but this decision is expected to be revisited by a future proposal.
See:
Allow guards on arbitrary patterns
We could treat guards as part of pattern syntax instead of as part of case
syntax. However, since guards make a pattern refutable, this wouldn’t allow them anywhere other than in case
s in the current language design. It would allow them to be nested within cases:
match (x) {
case (n: i32 if n > 5, "some string") => { ... }
}
Such nesting might allow an expensive later check to be avoided. For example, in the above case we can avoid an ==
comparison on a string if a cheaper comparison of n > 5
fails. However, this would introduce complexity into the grammar, and it’s not clear that this feature would add sufficient value to justify that complexity.
An additional concern is that if we add let
…else
syntax, this would presumably permit things like:
let n: i32 if n > 5 = 20 else { return 0; };
… where it would be easy to misparse the if ... else
as being a single construct, where the intended parse would be:
let ((n: i32) if n > 5) = 20 else { return 0; };
See also a related Discord discussion.
Treat expression patterns as exhaustive if they cover all possible values
We could do more work to treat a set of expression patterns as being exhaustive, if each pattern has a constant value and between those constant values, all possible values of the type are covered. The advantage of this would be that we improve the precision of our language rules.
This change in rules has some disadvantages and problems:
- It would add some complexity to the rules and to implementations in order to track whether all possible values have been created.
- For even the simplest types where this would apply, such as
i8
, it seems unlikely that amatch
covering all possible values would be written, due to the large number of patterns required. - In many cases, the value being matched will carry an invariant so that matching a subset of the representable values would match all meaningful values. We would still have imprecise rules in those cases.
- Expression patterns are matched with
==
, and for an arbitrary==
it is not computable in general to tell whether a given set of values is exhaustive, so we would only be able to apply this in some subset of cases.
This restriction is straightforward to work around by adding a final unreachable default
, or by replacing the final value match with a default
or _
.
Allow non-exhaustive match
statements
We could permit match
statements that are not exhaustive, and execute none of the case blocks if none of the patterns match. This is a very common choice in the design of such language features. However, it is also a common source of errors, for example when matching a sum type and an alternative is missed. By making this a language rule, we ensure that developers can rely on such mistakes being caught.
There is an easy syntactic way to disable this check, by adding an explicit default => {}
case. If we made the opposite choice, we would not automatically have an easy way to request that the check be performed. We could add syntax to request it, but such syntax would likely be forgotten and the default behavior would be that errors are silently permitted. This seems sufficient to outweigh the potential ergonomic cost of requiring the default => {}
case to be written explicitly.
Another motivation for requiring exhaustiveness is that it simplifies other language rules. For example, when determining whether control flow can reach the end of a function with a declared return type, a separate exhaustiveness analysis is not necesasry.
One concern with exhaustiveness checking is that it will cause the addition of an alternative to a choice type to be a breaking change by default. However, this is also one of the main advantages, and the design of choice types is intended to eventually provide a mechanism to specify that a choice type is extensible, which if used would mean that a set of patterns for that choice type would only be considered exhaustive if it includes a wildcard pattern.
Future work
Or patterns
We could provide “or patterns”, allowing matching of one pattern or another with the same handler:
match (x) {
case (m: i32, 0) | (0, m: i32) => { return m; }
}
See the red-black tree rebalancing example for a real-world example where this would result in a simplification.
As patterns
We could provide “as-patterns” as a convenient way to give a name to a value while still matching parts of that value. Following Haskell, we could use:
- pattern ::= identifier
@
pattern
For example:
match (x) {
// `s` names the first element of the tuple.
case (s@{.a = n: i32, .b = 12}, 4) if n > 1 => { return s; }
}
Matching classes by dynamic type
We could provide a way to match polymorphic class objects based on their dynamic type. This might be the default when matching a polymorphic class, or might require opt-in. The behavior in this proposal is that only the static type of the operand is considered.
For example, we could default to matching the static type, and allow a dyn
pattern syntax for matching a pointer to a polymorphic class type, meaning that we match the dynamic type rather than the static type of the pointer:
abstract class Base { virtual fn F[me: Self](); }
class Derived1 extends Base {}
class Derived2 extends Base {}
fn PrintType(b: Base*) {
match (b) {
// `case d1: Derived*` would be invalid here,
// because it could never match.
case dyn d1: Derived1* => { Print("Derived1"); }
case dyn d2: Derived2* => { Print("Derived2"); }
default => { Print("Unknown derived class"); }
}
}
fn PrintTemplateType[template T:! Type](p: T*) {
match (p) {
// OK, dispatch is based on the static type.
case b: Base* => { Print("Base"); }
case d1: Derived1* => { Print("Derived1"); }
case d2: Derived2* => { Print("Derived2"); }
default => { Print("Unknown class"); }
}
}
However, at this time we do not have a design for a checked down-cast, so it’s not clear how this matching operation would fit into the design of classes, either syntactically or semantically.
User-defined pattern matching
We plan to provide a mechanism for allowing a user-defined type to specify how it can be matched by patterns. See proposal #157 for details.
Matching classes with struct patterns
We could allow a class to be matched by a struct pattern that matches its fields. This would make sense especially for data classes, and would be consistent with the behavior of ==
in the case where a struct is implicitly convertible to the class type. However, without a design for user-defined pattern matching and matching on dynamic type, there is a significant risk that this would conflict with the rules there, so it is deferred for now.
Matching by reference
When the scrutinee is an lvalue, it is sometimes desirable to form a mutable binding to it. For example, see the struct
inspection example below. We currently support this only for the me
binding in a method, using the addr
keyword, but could allow this more generally.
fn takeDamage(p: player*) {
match (*p) {
case {.hitpoints = 0, .lives = 0, _} => { gameOver(); }
case {.hitpoints = addr hp: i32*, .lives = addr l: i32*, _} if *hp == 0 => {
*hp = 10;
--*l;
}
case {.hitpoints = addr hp: i32*, _} if *hp <= 3 => {
--*hp;
messageAlmostDead();
}
case {.hitpoints = addr hp: i32*, _} => {
--*hp;
}
}
}
Work in this area will need to consider whether we can provide this feature ergonomically without introducing reference-like behavior.
Type deduction
This proposal does not cover type deduction, instead considering it to be a separate topic from pattern matching syntax, even though the semantic behavior of the two may be quite similar or identical.
We will need a proposal to explore type deduction and describe its functioning.
Match expressions
As demonstrated in the switching an enum example below, it would be valuable to have an expression match
syntax in addition to the statement match
syntax. We could follow the same approach as for if
statements, and say that a match
that appears at the start of a statement is a statement match
and any other match
is an expression match
.
As candidate syntax, braced case
bodies could be replaced by an expression followed by a comma. For example:
let opengl_color: Vec3 = match (c) {
case .red => Vec3.Make(1.0, 0.0, 0.0),
case .yellow => Vec3.Make(1.0, 1.0, 0.0),
case .green => Vec3.Make(0.0, 1.0, 0.0),
case .blue => Vec3.Make(0.0, 0.0, 1.0)
};
Examples
These examples are translations of examples in WG21 paper P0095R1 and P1371R3, with permission from the author of those examples, David Sankel. Thank you, David!
Examples from P0095R1
Figure 1. Declaration of a command data structure
C++ | P0095R1 | This proposal |
---|---|---|
|
|
|
Figure 2. Implementation of an output operator
C++ | P0095R1 | This proposal |
---|---|---|
|
|
|
Figure 3. Switching an enum
C++ | P0095R1 | This proposal |
---|---|---|
|
| |
|
|
|
Figure 4. Expression datatype
C++ | P0095R1 | This proposal |
---|---|---|
|
|
|
Figure 5. struct
inspection
C++ | P0095R1 | This proposal |
---|---|---|
|
| |
|
|
|
Example from P1371R5: Red-black tree rebalancing
With P1371R5 pattern matching
enum Color { Red, Black };
template <typename T>
struct Node {
void balance();
Color color;
std::shared_ptr<Node> lhs;
T value;
std::shared_ptr<Node> rhs;
};
template <typename T>
void Node<T>::balance() {
*this = inspect (*this) {
[case Black, (*?) [case Red, (*?) [case Red, a, x, b], y, c], z, d]
=> Node{Red, std::make_shared<Node>(Black, a, x, b),
y,
std::make_shared<Node>(Black, c, z, d)};
[case Black, (*?) [case Red, a, x, (*?) [case Red, b, y, c]], z, d] // left-right case
=> Node{Red, std::make_shared<Node>(Black, a, x, b),
y,
std::make_shared<Node>(Black, c, z, d)};
[case Black, a, x, (*?) [case Red, (*?) [case Red, b, y, c], z, d]] // right-left case
=> Node{Red, std::make_shared<Node>(Black, a, x, b),
y,
std::make_shared<Node>(Black, c, z, d)};
[case Black, a, x, (*?) [case Red, b, y, (*?) [case Red, c, z, d]]] // right-right case
=> Node{Red, std::make_shared<Node>(Black, a, x, b),
y,
std::make_shared<Node>(Black, c, z, d)};
self => self;
};
}
With this proposal
choice Color { Red, Black }
class Node(T:! Type) {
fn balance[addr me: Self*]();
var color: Color;
var lhs: SharedPtr(Node);
var value: T;
var rhs: SharedPtr(Node);
}
fn MakeBalanced[T:! Type](a: SharedPtr(Node(T)), x: T,
b: SharedPtr(Node(T)), y: T,
c: SharedPtr(Node(T)), z: T,
d: SharedPtr(Node(T))) -> Node(T) {
return {.color = Color.Red,
.lhs = MakeShared({.color = Color.Black, .lhs = a, .value = x, .rhs = b}),
.value = y,
.rhs = MakeShared({.color = Color.Black, .lhs = c, .value = z, .rhs = d})};
}
fn Node(T:! Type).balance[addr me: Self*]() {
match (*me) {
{.color = .Black, .lhs = .PtrTo(
{.color = .Red, .lhs = .PtrTo(
{.color = .Red, .lhs = a: auto, .value = x: T, .rhs = b: auto}),
.value = y: T, .rhs = c: auto}),
.value = z: T, .rhs = d: auto} => {
*me = MakeBalanced(a, x, b, y, c, z, d);
}
{.color = .Black, .lhs = .PtrTo(
{.color = .Red, .lhs = a: auto, .value = x: T, .rhs = .PtrTo(
{.color = .Red, .lhs = b: auto, .value = y: T, .rhs = c: auto})}),
.value = z: T, .rhs = d: auto} => {
*me = MakeBalanced(a, x, b, y, c, z, d);
}
{.color = .Black, .lhs = a: auto, .value = x: T, .rhs = .PtrTo(
{.color = .Red, .lhs = .PtrTo(
{.color = .Red, .lhs = b: auto, .value = y: T, .rhs = c: auto}),
.value = z: T, .rhs = d: auto})} => {
*me = MakeBalanced(a, x, b, y, c, z, d);
}
{.color = .Black, .lhs = a: auto, .value = x: T, .rhs = .PtrTo(
{.color = .Red, .lhs = b: auto, .value = y: T, .rhs = .PtrTo(
{.color = .Red, .lhs = c: auto, .value = z: T, .rhs = d: auto})})} => {
*me = MakeBalanced(a, x, b, y, c, z, d);
}
default => {}
};
}
With this proposal plus #2187
choice Color { Red, Black }
class Node(T:! Type) {
fn balance[addr me: Self*]();
var color: Color;
var lhs: SharedPtr(Self);
var value: T;
var rhs: SharedPtr(Self);
external impl as Match {
interface Continuation {
extends Match.BaseContinuation;
fn Red[addr me: Self*](lhs: SharedPtr(Self), value: T, rhs: SharedPtr(Self)) -> ReturnType;
fn Black[addr me: Self*](lhs: SharedPtr(Self), value: T, rhs: SharedPtr(Self)) -> ReturnType;
}
fn Op[me: Self, C:! Continuation](continuation: C*) -> C.ReturnType {
match (me.color) {
case .Red => { return continuation->Red(me.lhs, me.value, me.rhs); }
case .Black => { return continuation->Black(me.lhs, me.value, me.rhs); }
}
}
}
}
fn MakeBalanced[T:! Type](a: SharedPtr(Node(T)), x: T,
b: SharedPtr(Node(T)), y: T,
c: SharedPtr(Node(T)), z: T,
d: SharedPtr(Node(T))) -> Node(T) {
return {.color = Color.Red,
.lhs = MakeShared({.color = Color.Black, .lhs = a, .value = x, .rhs = b}),
.value = y,
.rhs = MakeShared({.color = Color.Black, .lhs = c, .value = z, .rhs = d})};
}
fn Node(T:! Type).balance[addr me: Self*]() {
match (*me) {
.Black(.PtrTo(.Red(.PtrTo(.Red(a: auto, x: T, b: auto)), y: T, c: auto)), z: T, d: auto) => {
*me = MakeBalanced(a, x, b, y, c, z, d);
}
.Black(.PtrTo(.Red(a: auto, x: T, .PtrTo(.Red(b: auto, y: T, c: auto)))), z: T, d: auto) => {
*me = MakeBalanced(a, x, b, y, c, z, d);
}
.Black(a: auto, x: T, .PtrTo(.Red(.PtrTo(.Red(b: auto, y: T, c: auto)), z: T, d: auto))) => {
*me = MakeBalanced(a, x, b, y, c, z, d);
}
.Black(a: auto, x: T, .PtrTo(.Red(b: auto, y: T, .PtrTo(.Red(c: auto, z: T, d: auto))))) => {
*me = MakeBalanced(a, x, b, y, c, z, d);
}
default => {}
};
}