Lex
Table of contents
Overview
Lexing converts input source code into tokenized output. Literals, such as string literals, have their value parsed and form a single token at this stage.
Bracket matching
The lexer handles matching for ()
, []
, and {}
. When a bracket lacks a match, it will insert a “recovery” token to produce a match. As a consequence, the lexer’s output should always have matched brackets, even with invalid code.
While bracket matching could use hints such as contextual clues from indentation, that is not yet implemented.
Alternatives considered
Bracket matching in parser
Bracket matching could have also been implemented in the parser, with some awareness of parse state. However, that would shift some of the complexity of recovery in other error situations, such as where the parser searches for the next comma in a list. That needs to skip over bracketed ranges. We don’t think the trade-offs would yield a net benefit, so any change in this direction would need to show concrete improvement, for example better diagnostics for common issues.