Pattern matching syntax and semantics

Pull request

Table of contents

Abstract

This paper proposes concrete syntax and semantic choices for Carbon patterns.

Problem

Carbon uses patterns wherever a value should be given a name, decomposed, or matched against: in function parameters, variable declarations, match statement cases, for statement loop variables, and so on. Simplified forms of patterns, required to be just a simple name binding, appear in additional contexts, such as fields in classes and implicit parameter lists. While we have syntax specified for some of these constructs, we do not have an approved proposal describing the syntax or semantics of patterns.

Background

See pattern matching on wikipedia for a broad overview of the subject.

We refer to the value being matched by a pattern as the scrutinee.

Proposal

Patterns in Carbon are a generalization of the expression grammar. Compared to expressions, patterns add:

  • Bindings, of the form name: type, which give a name for the scrutinee.
  • var pattern, which creates a separate object to hold the value of the scrutinee, and causes any nested bindings to be mutable lvalues instead of immutable rvalues.
  • Additional syntax to make matching against structs more convenient.

Details

Expressions versus proper patterns

Expressions are patterns, as described below. A pattern that is not an expression, because it contains pattern-specific syntax such as a binding, is a proper pattern. Many expression forms, such as arbitrary function calls, are not permitted as proper patterns, so cannot contain bindings.

  • pattern ::= proper-pattern
fn F(n: i32) -> i32 { return n; }

match (F(42)) {
  // ❌ Error: binding can't appear in a function call.
  case (F(n: i32)) => {}
}

Expression patterns

An expression is a pattern.

  • pattern ::= expression

The pattern is compared with the expression using the == operator: pattern == scrutinee.

fn F(n: i32) {
  match (n) {
    // ✅ Results in an `n == 5` comparison.
    // OK despite `n` and `5` having different types.
    case 5 => {}
  }
}

Any == operations performed by a pattern match occur in lexical order, but for repeated matches against the same pattern, later comparisons may be skipped by reusing the result from an earlier comparison:

class ChattyIntMatcher {
  external impl as EqWith(i32) {
    fn Eq[me: ChattyIntMatcher](other: i32) {
      Print("Matching {0}", other);
      return other == 1;
    }
  }
}

fn F() {
  // Prints `Matching 1` then `Matching 2`,
  // may or may not then print `Matching 1` again.
  match ((1, 2)) {
    case ({} as ChattyIntMatcher, 0) => {}
    case (1, {} as ChattyIntMatcher) => {}
    case ({} as ChattyIntMatcher, 2) => {}
  }
}

Bindings

Name bindings

A name binding is a pattern.

  • binding-pattern ::= unused? identifier : expression
  • proper-pattern ::= binding-pattern

The type of the identifier is specified by the expression. The scrutinee is implicitly converted to that type if necessary.

fn F() -> i32 {
  match (5) (
    // ✅ `5` is implicitly converted to `i32`.
    // Returns `5 as i32`.
    case n: i32 => { return n; }
  }
}

When a new object needs to be created for the binding, the lifetime of the bound value matches the scope of the binding.

class NoisyDestructor {
  fn Make() -> Self { return {}; }
  external impl i32 as ImplicitAs(NoisyDestructor) {
    fn Convert[me: i32]() -> Self { return Make(); }
  }
  destructor {
    Print("Destroyed!");
  }
}

fn G() {
  // Does not print "Destroyed!".
  let n: NoisyDestructor = NoisyDestructor.Make();
  Print("Body of G");
  // Prints "Destroyed!" here.
}

fn H(n: i32) {
  // Does not print "Destroyed!".
  let (v: NoisyDestructor, w: i32) = (n, n);
  Print("Body of H");
  // Prints "Destroyed!" here.
}

As specified in #2022, the unused keyword indicates that the binding is intended to not be used.

Wildcard

A syntax like a binding but with _ in place of an identifier can be used to ignore part of a value.

  • binding-pattern ::= _ : expression

See #2022 for details.

The behavior is similar to that of an unused binding with a unique name.

fn F(n: i32) {
  match (n) {
    // ✅ Matches and discards the value of `n`.
    case _: i32 => {}
    // ❌ Error: unreachable.
    default => {}
  }
}

As specified in #1084, function redeclarations may replace named bindings with wildcards but may not use different names.

fn G(n: i32);
fn H(n: i32);
fn J(n: i32);

// ✅ Does not use `n`.
fn G(_: i32) {}
// ❌ Error: name of parameter does not match declaration.
fn H(m: i32) {}
// ✅ Does not use `n`.
fn J(unused n: i32);

Generic bindings

A :! can be used in place of : for a binding that is usable at compile time.

  • generic-pattern ::= unused? template? identifier :! expression
  • generic-pattern ::= template? _ :! expression
  • proper-pattern ::= generic-pattern
// ✅ `F` takes a generic type parameter `T` and a parameter `x` of type `T`.
fn F(T:! Type, x: T) {
  var v: T = x;
}

The template keyword indicates the binding is introducing a template parameter, so name lookups into the parameter should be deferred until its value is known.

auto and type deduction

The auto keyword is a placeholder for a unique deduced type.

  • expression ::= auto
fn F(n: i32) {
  var v: auto = SomeComplicatedExpression(n);
  // Equivalent to:
  var w: T = SomeComplicatedExpression(n);
  // ... where `T` is the type of the initializer.
}

The auto keyword is only permitted in specific contexts. Currently these are:

  • As the return type of a function.
  • As the type of a binding.

It is anticipated that auto may be permitted in more contexts in the future, for example as a generic argument in a parameterized type that appears in a context where auto is allowed, such as Vector(auto) or auto*.

When the type of a binding requires type deduction, the type is deduced against the type of the scrutinee and deduced values are substituted back into the type before pattern matching is performed.

fn G[T:! Type](p: T*);
class X { external impl as ImplicitAs(i32*); }
// ✅ Deduces `T = i32` then implicitly and
// trivially converts `p` to `i32*`.
fn H1(p: i32*) { G(p); }
// ❌ Error, can't deduce `T*` from `X`.
fn H2(p: X) { G(p); }

The above is only an illustration; the behavior of type deduction is not specified in this proposal.

var

A var prefix indicates that a pattern provides mutable storage for the scrutinee.

  • proper-pattern ::= var proper-pattern

A var pattern matches when its nested pattern matches. The type of the storage is the resolved type of the nested pattern. Any bindings within the nested pattern refer to portions of the corresponding storage rather than to the scrutinee.

fn F(p: i32*);
fn G() {
  match ((1, 2)) {
    // `n` is a mutable `i32`.
    case (var n: i32, 1) => { F(&n); }
    // `n` and `m` are the elements of a mutable `(i32, i32)`.
    case var (n: i32, m: i32) => { F(if n then &n else &m); }
  }
}

Pattern matching precedes the initialization of the storage for any var patterns. An introduced variable is only initialized if the complete pattern matches.

class X {
  destructor { Print("Destroyed!"); }
}
fn F(x: X) {
  match ((x, 1 as i32)) {
    case (var y: X, 0) => {}
    case (var z: X, 1) => {}
    // Prints "Destroyed!" only once, when `z` is destroyed.
  }
}

A var pattern cannot be nested within another var pattern. The declaration syntax var pattern = expresson ; is equivalent to let var pattern = expression ;.

Tuple patterns

A tuple of patterns can be used as a pattern.

  • tuple-pattern ::= ( [expression ,]* proper-pattern [, pattern]* ,? )
  • proper-pattern ::= tuple-pattern

A tuple-pattern containing no commas is treated as grouping parens: the contained proper-pattern is matched directly against the scrutinee. Otherwise, the behavior is as follows.

A tuple pattern is matched left-to-right. The scrutinee is required to be of tuple type.

Note that a tuple pattern must contain at least one proper-pattern. Otherwise, it is a tuple-valued expression. However, a tuple pattern and a corresponding tuple-valued expression are matched in the same way because == for a tuple compares fields left-to-right.

Struct patterns

A struct can be matched with a struct pattern.

  • proper-pattern ::= { [field-init ,]* proper-field-pattern [, field-pattern]* }
  • proper-pattern ::= { [field-pattern ,]+ _ }
  • field-init ::= designator = expression
  • proper-field-pattern ::= designator = proper-pattern
  • proper-field-pattern ::= binding-pattern
  • field-pattern ::= field-init
  • field-pattern ::= proper-field-pattern

A struct pattern resembles a struct literal, with at least one field initialized with a proper pattern:

match ({.a = 1, .b = 2}) {
  // Struct literal as an expression pattern.
  case {.b = 2, .a = 1} => {}
  // Struct pattern.
  case {.b = n: i32, .a = m: i32} => {}
}

The scrutinee is required to be of struct type, and to have the same set of field names as the pattern. The pattern is matched left-to-right, meaning that matching is performed in the field order specified in the pattern, not in the field order of the scrutinee. This is consistent with the behavior of matching against a struct-valued expression, where the expression pattern becomes the left operand of the == and so determines the order in which == comparisons for fields are performed.

In the case where a field will be bound to an identifier with the same name, a shorthand syntax is available: a: T is synonymous with .a = a: T.

match ({.a = 1, .b = 2}) {
  case {a: i32, b: i32} => { return a + b; }
}

If some fields should be ignored when matching, a trailing , _ can be added to specify this:

match ({.a = 1, .b = 2}) {
  case {.a = 1, _} => { return 1; }
  case {b: i32, _} => { return b; }
}

This is valid even if all fields are actually named in the pattern.

Alternative patterns

An alternative pattern is used to match one alternative of a choice type.

  • proper-pattern ::= callee-expression tuple-pattern
  • proper-pattern ::= designator tuple-pattern?

Here, callee-expression is syntactically an expression that is valid as the callee in a function call expression, and an alternative pattern is syntactically a function call expression whose argument list contains at least one proper-pattern.

If a callee-expression is provided, it is required to name a choice type alternative that has a parameter list, and the scrutinee is implicitly converted to that choice type. Otherwise, the scrutinee is required to be of some choice type, and the designator is looked up in that type and is required to name an alternative with a parameter list if and only if a tuple-pattern is specified.

The pattern matches if the active alternative in the scrutinee is the specified alternative, and the arguments of the alternative match the given tuple pattern (if any).

choice Optional(T:! Type) {
  None,
  Some(T)
}

match (Optional(i32).None) {
  // ✅ `.None` resolved to `Optional(i32).None`.
  case .None => {}
  // ✅ `.Some` resolved to `Optional(i32).Some`.
  case .Some(n: i32) => { Print("{0}", n); }
  // ❌ Error, no such alternative exists.
  case .Other => {}
}

class X {
  external impl as ImplicitAs(Optional(i32));
}

match ({} as X) {
  // ✅ OK, but expression pattern.
  case Optional(i32).None => {}
  // ✅ OK, implicitly converts to `Optional(i32)`.
  case Optional(i32).Some(n: i32) => { Print("{0}", n); }
}

Note that a pattern of the form Optional(T).None is an expression pattern and is compared using ==.

Templates

Any checking of the type of the scrutinee against the type of the pattern that cannot be performed because the type of the scrutinee involves a template parameter is deferred until the template parameter’s value is known. During instantiation, patterns that are not meaningful due to a type error are instead treated as not matching. This includes cases where an == fails because of a missing EqWith implementation.

fn TypeName[template T:! Type](x: T) -> String {
  match (x) {
    // ✅ OK, the type of `x` is a template parameter.
    case _: i32 => { return "int"; }
    case _: bool => { return "bool"; }
    case _: auto* => { return "pointer"; }
    default => { return "unknown"; }
  }
}

Cases where the match is invalid for reasons not involving the template parameter are rejected when type-checking the template:

fn MeaninglessMatch[template T:! Type](x: T*) {
  match (*x) {
    // ✅ OK, `T` could be a tuple.
    case (_: auto, _: auto) => {}
    default => {}
  }
  match (x->y) {
    // ✅ OK, `T.y` could be a tuple.
    case (_: auto, _: auto) => {}
    default => {}
  }
  match (x) {
    // ❌ Error, tuple pattern cannot match value of non-tuple type `T*`.
    case (_: auto, _: auto) => {}
    default => {}
  }
}

Guards

We allow cases within a match statement to have guards. These are not part of pattern syntax, but instead are specific to case syntax:

  • case ::= case pattern [if expression]? => block

A guard indicates that a case only matches if some predicate holds. The bindings in the pattern are in scope in the guard:

match (x) {
  case (m: i32, n: i32) if m + n < 5 => { return m - n; }
}

For consistency, this facility is also available for default clauses, so that default remains equivalent to case _: auto.

Refutability, overlap, usefulness, and exhaustiveness

Some definitions:

  • A pattern P is useful in the context of a set of patterns C if there exists a value that P can match that no pattern in C matches.
  • A set of patterns C is exhaustive if it matches all possible values. Equivalently, C is exhaustive if the pattern _: auto is not useful in the context of C.
  • A pattern P is refutable if there are values that it does not match, that is, if the pattern _: auto is useful in the context of {P}. Equivalently, the pattern P is refutable if the set of patterns {P} is not exhaustive.
  • A set of patterns C is overlapping if there exists any value that is matched by more than one pattern in C.

For the purpose of these terms, expression patterns that match a constant tuple, struct, or choice value are treated as if they were tuple, struct, or alternative patterns, respectively, and bool is treated like a choice type. Any expression patterns that remain after applying this rule are considered to match a single value from an infinite set of values so that a set of expression patterns is never exhaustive:

fn IsEven(n: u8) -> bool {
  // Not considered exhaustive.
  match (n) {
    case 0 => { return true; }
    case 1 => { return false; }
    ...
    case 255 => { return false; }
  }
  // Code here is considered to be reachable.
}
fn IsTrue(b: bool) -> bool {
  // Considered exhaustive.
  match (b) {
    case false => { return false; }
    case true => { return true; }
  }
  // Code here is considered to be unreachable.
}

When determining whether a pattern is useful, no attempt is made to determine the value of any guards, and instead a worst-case assumption is made: a guard on that pattern is assumed to evaluate to true and a guard on any pattern in the context set is assumed to evaluate to false.

We will diagnose the following situations:

  • A pattern is not useful in the context of prior patterns. In a match statement, this happens if a pattern or default cannot match because all cases it could cover are handled by prior cases or a prior default. For example:

    choice Optional(T:! Type) {
      None,
      Some(T)
    }
    fn F(a: Optional(i32), b: Optional(i32)) {
      match ((a, b)) {
        case (.Some(a: i32), _: auto) => {}
        // ✅ OK, but only matches values of the form `(None, Some)`,
        // because `(Some, Some)` is matched by the previous pattern.
        case (_: auto, .Some(b: i32)) => {}
        // ✅ OK, matches all remaining values.
        case (.None, .None) => {}
        // ❌ Error, this pattern never matches.
        case (_: auto, _: auto) => {}
      }
    }
    
  • A pattern match is not exhaustive and the program doesn’t explicitly say what to do when no pattern matches. For example:

    • If the patterns in a match are not exhaustive and no default is provided.

      fn F(n: i32) -> i32 {
        // ❌ Error, this `match` is not exhaustive.
        match (n) {
          case 0 => { return 2; }
          case 1 => { return 3; }
          case 2 => { return 5; }
          case 3 => { return 7; }
          case 4 => { return 11; }
        }
      }
      
    • If a refutable pattern appears in a context where only one pattern can be specified, such as a let or var declaration, and there is no fallback behavior. This currently includes all pattern matching contexts other than match statements, but the var/let-else feature in #1871 would introduce a second context permitting refutable matches, and overloaded functions might introduce a third context.

      fn F(n: i32) {
        // ❌ Error, refutable expression pattern `5` used in context
        // requiring an irrefutable pattern.
        var 5 = n;
      }
      // ❌ Error, refutable expression pattern `5` used in context
      // requiring an irrefutable pattern.
      fn G(n: i32, 5);
      
  • When a set of patterns have no ordering or tie-breaker, it is an error for them to overlap unless there is a unique best match for any value that matches more than one pattern. However, this situation does not apply to any current language rule:

    • For match statements, patterns are matched top-down, so overlap is permitted.
    • We do not yet have an approved design for overloaded functions, but it is anticipated that declaration order will be used in that case too.
    • For a set of impls that match a given impl lookup, argument deduction is used rather than pattern matching, but impls with the same type structure are an error unless a match_first declaration is used to order the impls. (This is a pre-existing rule and is unchanged by this proposal.)

Rationale

  • Software and language evolution
    • The , _ syntax for struct patterns enables a style where adding a struct member is not a breaking change.
    • The requirement that matches be exhaustive makes it easier to add new cases to a choice type, by requiring the compiler to detect places where the new value is not handled.
  • Code that is easy to read, understand, and write
    • Pattern syntax makes it easier to match complex values.
    • Modeling pattern syntax after expressions eases the burden of learning a new sub-language for pattern-matching: patterns are an extension of expressions, and expressions are a special case of patterns.
    • Requiring exhaustiveness for matches makes control flow easier to understand as there is never a value for which the match skips all cases.
  • Interoperability with and migration from existing C++ code
    • The rules for matching a templated value can be used to replace if constexpr in many cases.

Alternatives considered

Shorthand for auto

We could provide a shorter syntax for name: auto. Proposal #851 considered the following shorthands and decided against using them:

var n: _ = init;
var n = init;

A novel suggestion that avoids some of the disadvantages of those syntaxes would be to use:

var n:= init;

Advantages:

  • Shorter syntax for variables with a deduced type.
  • Potentially allows removal of the auto keyword.

Disadvantages:

  • Appears to introduce a := syntax, but that only arises in cases where an initializer immediately follows the name.
    • Cases such as var (a:, b:) = my_pair; would either be invalid or would not use the := syntax.
    • If we accept such cases, there is a risk of grammar ambiguities.
    • If we reject such cases, we may still want to keep auto around for them, creating inconsistency.
  • Not a complete replacement for auto if we want to also allow things like v: Vector(auto). C++ doesn’t allow the equivalent syntax currently, but it was part of the Concepts TS and seems likely to return at some point.
  • No syntactic difference between accidentally omitting a type entirely and requesting type deduction. However, the mistake of omitting a type but retaining the : seems unlikely, and the : followed by the absence of a type is a signal that something is happening, so this seems to be less of a concern than for the var n = init; syntax.

See discussion topics 1 and 2.

Struct pattern syntax

We could omit the , _ syntax. This would simplify struct patterns, but at the cost of removing a feature that can be useful for reducing verbosity and making library evolution easier.

We could always allow a struct pattern to match a struct with more fields, without requiring a , _ suffix. This would aid evolution by reducing the cases where adding a field to a struct can be a breaking change, but such cases would still exist. Further, this would make matches against a struct-valued expression inconsistent with matches against a struct pattern.

We could use a different syntax instead of , _. Other options that were explicitly considered:

  • , ... seems visually evocative of “and more stuff”, but risks conflicting with variadic syntax or at least being confusing when used in a variadic context, given that variadics are expected to claim ... for pack expansion.
  • , ._ suggests matching a field without specifying a name, but might create an impression of matching just one field, and three low punctuation characters in a row seems to be pushing the limits of readability.

On balance, , _ harmonizes well with the use of _ to introduce a wildcard, without being too visually confusing given that the other use of _ has a following :.

We could remove the {field: type} shorthand and require {.field = field: type}. This may avoid encouraging reusing the field name even when it is not appropriate, and would remove some syntactic sugar that’s not formally necessary. However, we expect this to be a common case whose ergonomics are important.

We could use a different syntax for {field: type} that is less divergent from other struct syntaxes:

  • {.field: type} seems like a contender, but doesn’t work because that is already recognized as a struct type literal. Also, the use of normal binding syntax means that every locally-introduced name is always introduced by name: type where name is not preceded by ..
  • {.=field: type} might be a reasonable mnemonic shorthand for {.field = field: type}, but looks a little surprising. This is probably the best choice if concerns are found with {field: type} syntax.

Type pattern matching

We could treat type deduction as a form of pattern matching. For example, we could allow

fn F(a: Vector(T:! Type)) -> T { return a[0]; }
fn G() -> i32 {
  let v: Vector(i32) = (1, 2, 3);
  // Deduces `T = i32`.
  return F(v);
}

where the value of T is determined by pattern-matching Vector(T:! Type) against the supplied type Vector(i32) of v. And symmetrically:

fn H[m: i32, n: i32](k: i32, (m, n)) { return m + n; }
fn I() {
  // Deduces `m = 2`, `n = 3`.
  H(1, (2, 3));
}

This would ensure consistency between pattern matching and deduction, potentially reducing the number of rules that Carbon developers need to learn.

We find that use of pattern-matching in type position can harm readability. For example, the first of these two examples may be easier to read due to having less nesting:

fn F[T:! Type](x: T);
fn F(x: (T:! Type));

Having distinct syntax for type-level matching and value-level matching helps guide the reader to the correct interpretation, even though the underlying matching process is expected to be similar or identical. As a result, we keep type deduction syntax and pattern matching syntax separate for now:

  • In pattern matching, bindings and wildcards are introduced by nested : / :! patterns, and the right-hand side of a binding pattern is never a proper pattern.
  • In type deduction, deduced values are specified separately and an expression written in terms of those bindings describes the type.

There are some contexts where there is no syntactic location for introducing deduced values, such as in case labels. For syntactic consistency, such cases should be addressed by adding forall syntax, if there is motivation to support deduction:

match (templated_value) {
  case forall [template T:! Type] (p: T*) => { heap.Delete(p); }
}

Introducer syntax for expression patterns

We could have some separate introducer syntax to distinguish expression patterns from other kinds of patterns:

match ((a, b)) {
  case is (1, 2) => {}
  case (is 3, n: i32) => {}
  case (m: i32, is 4) => {}
}

This would reduce the chance of confusion in cases where an expression and a similar-looking pattern are treated differently:

class TupleLike {
  external impl as (i32, i32);
}
fn MatchTupleLike(t: TupleLike) {
  match (t) {
    // ✅ OK, expression pattern;
    // `t` implicitly converted to `(i32, i32)` by
    // built-in `impl EqWith` for tuples.
    case (1, 2) => {}
    // ❌ Error, `t` is not a tuple.
    case (n: i32, 3) => {}
  }
}

This would also permit the same, or overlapping, syntax to be used in expressions and patterns with different meanings, for example:

fn F() {
  // Here, `[...]` is an array type.
  var array: [i32; 5];
  match (&array) {
    // Here, `[...]` could introduce deduced parameters.
    case [T:! Type] x: T* => {}
  }
}

Similarly, if we added a &PATT pattern to match the address of a value, a construct like &n would be ambiguous without in introducer:

var n: i32 = 5;
fn MatchPointerOrPointee(p: i32*) {
  match (p) {
    case is &n => {
      Print("given pointer to n");
    }
    // Here, the pattern `&PATT` could match the address of
    // a value that matches `PATT`.
    case &(is n) => {
      Print("given pointer to i32 whose value equals the value of n");
    }
    case &(m: i32) => {
      Print("given pointer to value {0}", m);
    }
  }
}

Such an introducer would also denote the portions of a pattern that are evaluated at runtime rather than at compile time, benefitting Carbon’s goals of readability and of predictable performance, and would also add syntactic separation between the parts of a pattern that have full exhaustiveness checking and the parts that do not.

However, this would also introduce additional ceremony for the common case where part of a pattern is a specific value. This could be mitigated by permitting certain kinds of value as patterns without an introducer, such as numeric literals and true and false, at the cost of introducing more complexity and more confusion over which cases require is and which do not.

We have also not identified a good choice for the introducer syntax, should we pursue this direction. is is not an ideal choice, because elsewhere in Carbon syntax, it is a relation between a value and its type, so is T may be misread as matching values whose type is T. == expression has been suggested, but that would imply the opposite operand order of that in this proposal – scrutinee == expression rather than expression == scrutinee – which would compare struct fields in a surprising order that diverges from the order of comparison for a struct pattern. A prefix == may also be visually surprising.

For now, we do not add such an introducer, but this decision is expected to be revisited by a future proposal.

See:

Allow guards on arbitrary patterns

We could treat guards as part of pattern syntax instead of as part of case syntax. However, since guards make a pattern refutable, this wouldn’t allow them anywhere other than in cases in the current language design. It would allow them to be nested within cases:

match (x) {
  case (n: i32 if n > 5, "some string") => { ... }
}

Such nesting might allow an expensive later check to be avoided. For example, in the above case we can avoid an == comparison on a string if a cheaper comparison of n > 5 fails. However, this would introduce complexity into the grammar, and it’s not clear that this feature would add sufficient value to justify that complexity.

An additional concern is that if we add letelse syntax, this would presumably permit things like:

let n: i32 if n > 5 = 20 else { return 0; };

… where it would be easy to misparse the if ... else as being a single construct, where the intended parse would be:

let ((n: i32) if n > 5) = 20 else { return 0; };

See also a related Discord discussion.

Treat expression patterns as exhaustive if they cover all possible values

We could do more work to treat a set of expression patterns as being exhaustive, if each pattern has a constant value and between those constant values, all possible values of the type are covered. The advantage of this would be that we improve the precision of our language rules.

This change in rules has some disadvantages and problems:

  • It would add some complexity to the rules and to implementations in order to track whether all possible values have been created.
  • For even the simplest types where this would apply, such as i8, it seems unlikely that a match covering all possible values would be written, due to the large number of patterns required.
  • In many cases, the value being matched will carry an invariant so that matching a subset of the representable values would match all meaningful values. We would still have imprecise rules in those cases.
  • Expression patterns are matched with ==, and for an arbitrary == it is not computable in general to tell whether a given set of values is exhaustive, so we would only be able to apply this in some subset of cases.

This restriction is straightforward to work around by adding a final unreachable default, or by replacing the final value match with a default or _.

Allow non-exhaustive match statements

We could permit match statements that are not exhaustive, and execute none of the case blocks if none of the patterns match. This is a very common choice in the design of such language features. However, it is also a common source of errors, for example when matching a sum type and an alternative is missed. By making this a language rule, we ensure that developers can rely on such mistakes being caught.

There is an easy syntactic way to disable this check, by adding an explicit default => {} case. If we made the opposite choice, we would not automatically have an easy way to request that the check be performed. We could add syntax to request it, but such syntax would likely be forgotten and the default behavior would be that errors are silently permitted. This seems sufficient to outweigh the potential ergonomic cost of requiring the default => {} case to be written explicitly.

Another motivation for requiring exhaustiveness is that it simplifies other language rules. For example, when determining whether control flow can reach the end of a function with a declared return type, a separate exhaustiveness analysis is not necesasry.

One concern with exhaustiveness checking is that it will cause the addition of an alternative to a choice type to be a breaking change by default. However, this is also one of the main advantages, and the design of choice types is intended to eventually provide a mechanism to specify that a choice type is extensible, which if used would mean that a set of patterns for that choice type would only be considered exhaustive if it includes a wildcard pattern.

Future work

Or patterns

We could provide “or patterns”, allowing matching of one pattern or another with the same handler:

match (x) {
  case (m: i32, 0) | (0, m: i32) => { return m; }
}

See the red-black tree rebalancing example for a real-world example where this would result in a simplification.

As patterns

We could provide “as-patterns” as a convenient way to give a name to a value while still matching parts of that value. Following Haskell, we could use:

  • pattern ::= identifier @ pattern

For example:

match (x) {
  // `s` names the first element of the tuple.
  case (s@{.a = n: i32, .b = 12}, 4) if n > 1 => { return s; }
}

Matching classes by dynamic type

We could provide a way to match polymorphic class objects based on their dynamic type. This might be the default when matching a polymorphic class, or might require opt-in. The behavior in this proposal is that only the static type of the operand is considered.

For example, we could default to matching the static type, and allow a dyn pattern syntax for matching a pointer to a polymorphic class type, meaning that we match the dynamic type rather than the static type of the pointer:

abstract class Base { virtual fn F[me: Self](); }
class Derived1 extends Base {}
class Derived2 extends Base {}

fn PrintType(b: Base*) {
  match (b) {
    // `case d1: Derived*` would be invalid here,
    // because it could never match.
    case dyn d1: Derived1* => { Print("Derived1"); }
    case dyn d2: Derived2* => { Print("Derived2"); }
    default => { Print("Unknown derived class"); }
  }
}

fn PrintTemplateType[template T:! Type](p: T*) {
  match (p) {
    // OK, dispatch is based on the static type.
    case b: Base* => { Print("Base"); }
    case d1: Derived1* => { Print("Derived1"); }
    case d2: Derived2* => { Print("Derived2"); }
    default => { Print("Unknown class"); }
  }
}

However, at this time we do not have a design for a checked down-cast, so it’s not clear how this matching operation would fit into the design of classes, either syntactically or semantically.

User-defined pattern matching

We plan to provide a mechanism for allowing a user-defined type to specify how it can be matched by patterns. See proposal #157 for details.

Matching classes with struct patterns

We could allow a class to be matched by a struct pattern that matches its fields. This would make sense especially for data classes, and would be consistent with the behavior of == in the case where a struct is implicitly convertible to the class type. However, without a design for user-defined pattern matching and matching on dynamic type, there is a significant risk that this would conflict with the rules there, so it is deferred for now.

Matching by reference

When the scrutinee is an lvalue, it is sometimes desirable to form a mutable binding to it. For example, see the struct inspection example below. We currently support this only for the me binding in a method, using the addr keyword, but could allow this more generally.

fn takeDamage(p: player*) {
  match (*p) {
    case {.hitpoints = 0, .lives = 0, _} => { gameOver(); }
    case {.hitpoints = addr hp: i32*, .lives = addr l: i32*, _} if *hp == 0 => {
      *hp = 10;
      --*l;
    }
    case {.hitpoints = addr hp: i32*, _} if *hp <= 3 => {
      --*hp;
      messageAlmostDead();
    }
    case {.hitpoints = addr hp: i32*, _} => {
      --*hp;
    }
  }
}

Work in this area will need to consider whether we can provide this feature ergonomically without introducing reference-like behavior.

Type deduction

This proposal does not cover type deduction, instead considering it to be a separate topic from pattern matching syntax, even though the semantic behavior of the two may be quite similar or identical.

We will need a proposal to explore type deduction and describe its functioning.

Match expressions

As demonstrated in the switching an enum example below, it would be valuable to have an expression match syntax in addition to the statement match syntax. We could follow the same approach as for if statements, and say that a match that appears at the start of a statement is a statement match and any other match is an expression match.

As candidate syntax, braced case bodies could be replaced by an expression followed by a comma. For example:

let opengl_color: Vec3 = match (c) {
  case .red    => Vec3.Make(1.0, 0.0, 0.0),
  case .yellow => Vec3.Make(1.0, 1.0, 0.0),
  case .green  => Vec3.Make(0.0, 1.0, 0.0),
  case .blue   => Vec3.Make(0.0, 0.0, 1.0)
};

Examples

These examples are translations of examples in WG21 paper P0095R1 and P1371R3, with permission from the author of those examples, David Sankel. Thank you, David!

Examples from P0095R1

Figure 1. Declaration of a command data structure

C++P0095R1This proposal
```c++ struct set_score { std::size_t value; }; struct fire_missile {}; struct fire_laser { unsigned intensity; }; struct rotate { double amount; }; struct command { std::variant< set_score, fire_missile, fire_laser, rotate > value; }; ``` ```c++ lvariant command { std::size_t set_score; std::monostate fire_missile; unsigned fire_laser; double rotate; }; ``` ```carbon choice command { set_score(u64), fire_missile, fire_laser(u32), rotate(f64) } ```

Figure 2. Implementation of an output operator

C++P0095R1This proposal
```c++ namespace { struct Output { std::ostream& operator()(std::ostream& stream, const set_score& ss) const { return stream << "Set the score to " << ss.value << ".\n"; } std::ostream& operator()(std::ostream& stream, const fire_missile&) const { return stream << "Fire a missile.\n"; } std::ostream& operator()(std::ostream& stream, const fire_laser& fl) const { return stream << "Fire a laser with " << fl.intensity << " intensity.\n"; } std::ostream& operator()(std::ostream& stream, const rotate& r) const { return stream << "Rotate by " << r.degrees << " degrees.\n" } }; } std::ostream& operator<<(std::ostream& stream, const command& cmd) { return std::visit(std::bind(Output(), std::ref(stream), std::placeholders::_1), cmd.value); } ``` ```c++ std::ostream& operator<<(std::ostream& stream, const command& cmd) { return inspect(cmd) { set_score value => stream << "Set the score to " << value << ".\n" fire_missile _ => stream << "Fire a missile.\n" fire_laser intensity => stream << "Fire a laser with " << intensity << " intensity.\n" rotate degrees => stream << "Rotate by " << degrees << " degrees.\n" } } ``` ```carbon impl command as Printable { fn Print[me: Self](stream: Cpp.std.ostream*) { match (me) { case .set_score(value: u64) => { stream->Print("Set the score to {0}.\n", value); } case .fire_missile => { stream->Print("Fire a missile.\n"); } case .fire_laser(intensity: u32) => { stream->Print("Fire a laser with {0} intensity.\n", intensity); } case .rotate(degrees: f64) => { stream->Print("Rotate by {0} degrees.\n", degrees); } } } } ```

Figure 3. Switching an enum

C++P0095R1This proposal
```c++ enum color { red, yellow, green, blue }; ``` ```carbon choice color { red, yellow, green, blue } ```
```c++ const Vec3 opengl_color = [&c] { switch(c) { case red: return Vec3(1.0, 0.0, 0.0); break; case yellow: return Vec3(1.0, 1.0, 0.0); break; case green: return Vec3(0.0, 1.0, 0.0); break; case blue: return Vec3(0.0, 0.0, 1.0); break; default: std::abort(); }(); ``` ```c++ const Vec3 opengl_color = inspect(c) { red => Vec3(1.0, 0.0, 0.0) yellow => Vec3(1.0, 1.0, 0.0) green => Vec3(0.0, 1.0, 0.0) blue => Vec3(0.0, 0.0, 1.0) }; ``` ```carbon // Carbon has neither expression-match nor lambdas yet, // so this can't easily be done in line. fn GetOpenGLColor(c: color) -> Vec3 { match (c) { case .red => { return Vec3.Make(1.0, 0.0, 0.0); } case .yellow => { return Vec3.Make(1.0, 1.0, 0.0); } case .green => { return Vec3.Make(0.0, 1.0, 0.0); } case .blue => { return Vec3.Make(0.0, 0.0, 1.0); } } } let opengl_color: Vec3 = GetOpenGLColor(c); ```

Figure 4. Expression datatype

</tr> </table> #### Figure 5. `struct` inspection
C++P0095R1This proposal
```c++ struct expression; struct sum_expression { std::unique_ptr left_hand_side; std::unique_ptr right_hand_side; }; struct expression { std::variant<sum_expression, int, std::string> value; }; expression simplify(const expression & exp) { if(sum_expression const * const sum = std::get_if(&exp)) { if( int const * const lhsInt = std::get_if( sum->left_hand_side.get() ) && *lhsInt == 0 ) { return simplify(*sum->right_hand_side); } else if( int const * const rhsInt = std::get_if( sum->right_hand_side.get() ) && *rhsInt == 0 ) { return simplify(*sum->left_hand_side); } else { return {sum_expression{ std::make_unique(simplify(*sum->left_hand_side)), std::make_unique(simplify(*sum->right_hand_side))}} } } return exp; } void simplify2(expression & exp) { if(sum_expression * const sum = std::get_if(&exp)) { if( int * const lhsInt = std::get_if( sum->left_hand_side.get() ) && *lhsInt == 0 ) { expression tmp(std::move(*sum->right_hand_side)); exp = std::move(tmp); simplify(exp); } else if( int * const rhsInt = std::get_if( sum->right_hand_side.get() ) && *rhsInt == 0 ) { expression tmp(std::move(*sum->left_hand_side)); exp = std::move(tmp); simplify(exp); } else { simplify(*sum->left_hand_side); simplify(*sum->right_hand_side); } } return exp; } ``` </td> ```c++ lvariant expression; struct sum_expression { std::unique_ptr left_hand_side; std::unique_ptr right_hand_side; }; lvariant expression { sum_expression sum; int literal; std::string var; }; expression simplify(const expression & exp) { return inspect(exp) { sum {*(literal 0), *rhs} => simplify(rhs) sum {*lhs , *(literal 0)} => simplify(lhs) sum {*lhs , *rhs} => expression::sum{ std::make_unique(simplify(lhs)), std::make_unique(simplify(rhs))}; _ => exp }; } void simplify2(expression & exp) { inspect(exp) { sum {*(literal 0), *rhs} => { expression tmp(std::move(rhs)); exp = std::move(tmp); simplify2(exp); } sum {*lhs , *(literal 0)} => { expression tmp(std::move(lhs)); exp = std::move(tmp); simplify2(exp); } sum {*lhs , *rhs} => { simplify2(lhs); simplify2(rhs); } _ => ; }; } ``` </td> ```carbon choice expression { sum(UniquePtr(expression), UniquePtr(expression)), literal(i32), var: String } // This assumes that UniquePtr provides the matching // functionality described in #2187. fn simplify(exp: expression) -> expression { match (exp) { case .sum(.PtrTo(.literal(0)), .PtrTo(rhs: expression)) => { return simplify(rhs); } case .sum(.PtrTo(lhs: expression), .PtrTo(.literal(0))) => { return simplify(lhs); } case .sum(.PtrTo(lhs: expression), .PtrTo(rhs: expression)) => { return expression.sum(MakeUnique(simplify(lhs)), MakeUnique(simplify(rhs))); } default => { return exp; } } } ```
C++P0095R1This proposal
```c++ struct player { std::string name; int hitpoints; int lives; }; ``` ```carbon class player { var name: String; var hitpoints: i32; var lives: i32; } ```
```c++ void takeDamage(player &p) { if(p.hitpoints == 0 && p.lives == 0) gameOver(); else if(p.hitpoints == 0) { p.hitpoints = 10; p.lives--; } else if(p.hitpoints <= 3) { p.hitpoints--; messageAlmostDead(); } else { p.hitpoints--; } } ``` ```c++ void takeDamage(player &p) { inspect(p) { {hitpoints: 0, lives:0} => gameOver(); {hitpoints:hp@0, lives:l} => hp=10, l--; {hitpoints:hp} if (hp <= 3) => { hp--; messageAlmostDead(); } {hitpoints:hp} => hp--; } } ``` ```carbon fn takeDamage(p: player*) { match (*p) { case {.hitpoints = 0, .lives = 0, _} => { gameOver(); } case {.hitpoints = 0, _} => { p->hitpoints = 10; --p->lives; } case {.hitpoints = hp: i32, _} if hp <= 3 => { --p->hitpoints; messageAlmostDead(); } default => { --p->hitpoints; } } } ```
### Example from P1371R5: Red-black tree rebalancing #### With P1371R5 pattern matching ```c++ enum Color { Red, Black }; template struct Node { void balance(); Color color; std::shared_ptr lhs; T value; std::shared_ptr rhs; }; template void Node::balance() { *this = inspect (*this) { [case Black, (*?) [case Red, (*?) [case Red, a, x, b], y, c], z, d] => Node{Red, std::make_shared(Black, a, x, b), y, std::make_shared(Black, c, z, d)}; [case Black, (*?) [case Red, a, x, (*?) [case Red, b, y, c]], z, d] // left-right case => Node{Red, std::make_shared(Black, a, x, b), y, std::make_shared(Black, c, z, d)}; [case Black, a, x, (*?) [case Red, (*?) [case Red, b, y, c], z, d]] // right-left case => Node{Red, std::make_shared(Black, a, x, b), y, std::make_shared(Black, c, z, d)}; [case Black, a, x, (*?) [case Red, b, y, (*?) [case Red, c, z, d]]] // right-right case => Node{Red, std::make_shared(Black, a, x, b), y, std::make_shared(Black, c, z, d)}; self => self; }; } ``` #### With this proposal ```carbon choice Color { Red, Black } class Node(T:! Type) { fn balance[addr me: Self*](); var color: Color; var lhs: SharedPtr(Node); var value: T; var rhs: SharedPtr(Node); } fn MakeBalanced[T:! Type](a: SharedPtr(Node(T)), x: T, b: SharedPtr(Node(T)), y: T, c: SharedPtr(Node(T)), z: T, d: SharedPtr(Node(T))) -> Node(T) { return {.color = Color.Red, .lhs = MakeShared({.color = Color.Black, .lhs = a, .value = x, .rhs = b}), .value = y, .rhs = MakeShared({.color = Color.Black, .lhs = c, .value = z, .rhs = d})}; } fn Node(T:! Type).balance[addr me: Self*]() { match (*me) { {.color = .Black, .lhs = .PtrTo( {.color = .Red, .lhs = .PtrTo( {.color = .Red, .lhs = a: auto, .value = x: T, .rhs = b: auto}), .value = y: T, .rhs = c: auto}), .value = z: T, .rhs = d: auto} => { *me = MakeBalanced(a, x, b, y, c, z, d); } {.color = .Black, .lhs = .PtrTo( {.color = .Red, .lhs = a: auto, .value = x: T, .rhs = .PtrTo( {.color = .Red, .lhs = b: auto, .value = y: T, .rhs = c: auto})}), .value = z: T, .rhs = d: auto} => { *me = MakeBalanced(a, x, b, y, c, z, d); } {.color = .Black, .lhs = a: auto, .value = x: T, .rhs = .PtrTo( {.color = .Red, .lhs = .PtrTo( {.color = .Red, .lhs = b: auto, .value = y: T, .rhs = c: auto}), .value = z: T, .rhs = d: auto})} => { *me = MakeBalanced(a, x, b, y, c, z, d); } {.color = .Black, .lhs = a: auto, .value = x: T, .rhs = .PtrTo( {.color = .Red, .lhs = b: auto, .value = y: T, .rhs = .PtrTo( {.color = .Red, .lhs = c: auto, .value = z: T, .rhs = d: auto})})} => { *me = MakeBalanced(a, x, b, y, c, z, d); } default => {} }; } ``` #### With this proposal plus #2187 ```carbon choice Color { Red, Black } class Node(T:! Type) { fn balance[addr me: Self*](); var color: Color; var lhs: SharedPtr(Self); var value: T; var rhs: SharedPtr(Self); external impl as Match { interface Continuation { extends Match.BaseContinuation; fn Red[addr me: Self*](lhs: SharedPtr(Self), value: T, rhs: SharedPtr(Self)) -> ReturnType; fn Black[addr me: Self*](lhs: SharedPtr(Self), value: T, rhs: SharedPtr(Self)) -> ReturnType; } fn Op[me: Self, C:! Continuation](continuation: C*) -> C.ReturnType { match (me.color) { case .Red => { return continuation->Red(me.lhs, me.value, me.rhs); } case .Black => { return continuation->Black(me.lhs, me.value, me.rhs); } } } } } fn MakeBalanced[T:! Type](a: SharedPtr(Node(T)), x: T, b: SharedPtr(Node(T)), y: T, c: SharedPtr(Node(T)), z: T, d: SharedPtr(Node(T))) -> Node(T) { return {.color = Color.Red, .lhs = MakeShared({.color = Color.Black, .lhs = a, .value = x, .rhs = b}), .value = y, .rhs = MakeShared({.color = Color.Black, .lhs = c, .value = z, .rhs = d})}; } fn Node(T:! Type).balance[addr me: Self*]() { match (*me) { .Black(.PtrTo(.Red(.PtrTo(.Red(a: auto, x: T, b: auto)), y: T, c: auto)), z: T, d: auto) => { *me = MakeBalanced(a, x, b, y, c, z, d); } .Black(.PtrTo(.Red(a: auto, x: T, .PtrTo(.Red(b: auto, y: T, c: auto)))), z: T, d: auto) => { *me = MakeBalanced(a, x, b, y, c, z, d); } .Black(a: auto, x: T, .PtrTo(.Red(.PtrTo(.Red(b: auto, y: T, c: auto)), z: T, d: auto))) => { *me = MakeBalanced(a, x, b, y, c, z, d); } .Black(a: auto, x: T, .PtrTo(.Red(b: auto, y: T, .PtrTo(.Red(c: auto, z: T, d: auto))))) => { *me = MakeBalanced(a, x, b, y, c, z, d); } default => {} }; } ```