Semicolons terminate statements

Abstract

Statements, declarations, and definitions will terminate with either a semicolon (;) or a close curly brace (}). Semicolons are never optional.

For example, x = x + 2; and class C; end with a semicolon, while for ( ... ) { ... } and class C { ... } end with a close curly brace.

This does not affect any approved proposal; rather, it makes an important assumption explicit.

Problem

Statements need some system for separation. There are two main options for this:

  1. Require semicolons to terminate statements.
  2. Automatically determine where statements terminate.
    • Some languages, such as Python, define a syntax where a newline terminates statements.
    • Other languages, such as JavaScript, require semicolons but define rules for semicolon insertion.

Although Carbon’s design currently assumes semicolons are required, that requirement hasn’t been directly addressed by a proposal.

Background

Discussion in Carbon

This was discussed on leads issue #1924: Semicolon. Some rationale is provided there, stemming from discussion #1739: Semicolon.

In other languages

This blog provides a similar survey of multiple languages.

Requiring semicolons

In C++, C#, and Java, semicolons are always required.

In Rust, semicolons are generally required, but may be omitted for an implicit return. Because blocks are expressions, there are ambiguities in expression statements between parsing as a standalone statement and parsing as part of an expression.

Optional semicolons

In Python, a newline normally ends a simple statement, and parentheses are an idiomatic way to create multi-line statements. Semicolons may be used to explicitly separate statements on the same line. For example:

value = (
  "text"
)
a = 1; b = 2; c = 3
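
As a rough check of the two forms above, the sketch below asks Python's own parser how many top-level statements each snippet contains. The helper name count_statements is hypothetical and introduced only for illustration; ast.parse is part of the standard library.

```python
import ast


def count_statements(source: str) -> int:
    """Return the number of top-level statements Python parses from source."""
    return len(ast.parse(source).body)


# A parenthesized expression may span several lines but is still one statement.
multi_line = 'value = (\n  "text"\n)\n'
# Semicolons separate simple statements that share a line.
one_line = "a = 1; b = 2; c = 3\n"

print(count_statements(multi_line))  # 1
print(count_statements(one_line))    # 3
```

The parenthesized assignment counts as one statement even though it spans three lines, while the single line with semicolons counts as three.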

Swift allows some statements to wrap lines, although multiple statements on the same line (x = 1 x = 1) require a semicolon. The detailed rules aren’t documented, so it’s difficult to assess them beyond noting that Swift developers are generally happy with the results. Swift’s statements section doesn’t define statement boundaries, and the grammar documents that line breaks are treated as whitespace. However, there are observable ways the behavior can lead to small mistakes; these may often be caught by the compiler, but will sometimes be missed. For example:

// One statement in Swift, but two in Python and Kotlin.
var x = 1
      + 1
// Two statements in Swift because of whitespace sensitivity. The second
// statement produces a compiler warning.
var x = 1
      +1
// Two calls, the second on the return value of the first.
Make() ()
// A single call followed by an empty tuple. Second statement is valid.
Make()
()

Kotlin permits a newline to be used to terminate statements instead of a semicolon. Kotlin’s grammar explicitly enumerates all the places where newlines can appear (see mentions of NL in the grammar), and doesn’t allow newlines in places where they would introduce ambiguity.

// This is unambiguously parsed as two statements, because
// a newline is not permitted before a `+` operator.
var x = 1
+ 1

In JavaScript and TypeScript, semicolons are part of the formal syntax, and ECMAScript provides Automatic Semicolon Insertion (ASI). Note that ECMAScript also documents Interesting Cases, which may confuse developers.

In Go, semicolons are similarly part of the formal syntax, and a semicolon is inserted automatically when a line ends with certain tokens. This is also used to enforce style, for example by requiring the opening { of an if body to be on the same line in order to avoid semicolon insertion.

Proposal

As described in the abstract, Carbon will require semicolons to terminate statements and forward declarations.

Examples with a semicolon include:

  • Most statements, such as Foo(); and x = x + 2;.
  • var statements and declarations, such as var x: i32 = 0;.
  • Forward declarations, such as class C; or fn Foo();.

Examples with a close curly brace include:

  • Statement grammars that terminate with a curly brace, such as if ( ... ) { ... } or match ( ... ) { ... }.
  • Declarations that include a definition, such as class C { ... } or fn Foo() { ... }.
    • This is partly in contrast with C++, which requires a semicolon in class C { ... };.

Carbon’s current design has been written assuming the above; this proposal makes requiring semicolons an explicit decision.

Rationale

Alternatives considered

Optional semicolons

Semicolons could be made optional. This would most likely be with an approach similar to Python, based mainly on newlines.

Advantages:

  • Languages with optional semicolons are very popular. Python is either the most or the second most widely used programming language by most measures (1 2 3).
  • Echoes the direction of evolution in other languages.
    • For example, Swift and Kotlin are recently designed languages that make semicolons optional in ways that work well for developers in practice.
  • Compile-time validation and errors on no-op statements could be used to detect some of the issues that arise with optional semicolons in Python and JavaScript.
  • While optional semicolons seem to get fewer complaints, requiring semicolons is likely to lead to ongoing friction due to the overall trend. This can be seen for languages like Rust (1 2 3 4) or C# (1 2 3).

Disadvantages:

  • Semicolons are a visual anchor for statement termination when scanning code.
  • Requiring semicolons leaves more evolutionary paths available for Carbon. This includes both syntactic changes without introducing ambiguity and implicit returns as in Rust.
    • Although it’s not clear Carbon will fully adopt implicit returns, similar syntactic choices may arise for lambdas.
  • Semicolons are a signal to the compiler about where statements were intended to terminate, and can be used to provide better error detection as a consequence.
    • For contrast, optional semicolons may lead to unintended statements. While ASI’s problems are documented, we expect any optional semicolon approach will lead to some increase in bugs that the compiler cannot detect, if only because fewer mistakes are necessary in order to produce valid but incorrect code.
  • Making code with no semicolons idiomatic may increase the “strangeness” for C++ developers, who are the primary target for Carbon.
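
To make the "valid but incorrect code" concern concrete, here is a minimal Python sketch; the function names are hypothetical and chosen only for illustration. Moving an operator onto the next line silently produces a second statement whose result is discarded, and no error is reported.

```python
def total_with_line_break() -> int:
    total = 1
    + 2  # Parsed as its own expression statement (unary plus); result is discarded.
    return total


def total_with_parentheses() -> int:
    total = (1
             + 2)  # Parentheses keep both lines in one statement.
    return total


print(total_with_line_break())   # 1
print(total_with_parentheses())  # 3
```

With a required terminator, the first form would read as a single statement ending at the semicolon, so the intent would be unambiguous to both the reader and the compiler.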

Semicolons are expected to be a net benefit, as explained by the rationale.