as expressions
Problem
We would like to provide a notation for the following operations:
- Requesting a type conversion in order to select an operation to perform, or to resolve an ambiguity between possible operations:
    fn Ratio(a: i32, b: i32) -> f64 {
      // Note that a / b would invoke a different / operation.
      return a / (b as f64);
    }
- Specifying the type that an expression will have or will be converted into, for documentation purposes.
    class Thing {
      var id: i32;
    }
    fn PrintThing(t: Thing) {
      // 'as i32' reminds the reader what type we're printing.
      Print(t.id as i32);
    }
- Specifying the type that an expression is expected to have, potentially after implicit conversions, as a form of static assertion.
    fn Munge() {
      // I expect this expression to produce a Widget but I'm getting compiler
      // errors and I'd like to narrow down why.
      F(Some().Complex().Expression() as Widget);
    }
In general, the developer wants to specify that an expression should be considered to produce a value of a particular type, and that type might be more general than the type of the expression, the same as the type of the expression, or perhaps might represent a different way of viewing the value.
The first of the above problems is especially important in Carbon due to the use of facet types for generics. Explicit conversions of types to interfaces will be necessary in order to select the meaning of operations, because the same member name on different facet types for the same underlying type will in general have different meanings.
For this proposal, the following are out of scope:
- Requesting a type conversion that changes the value, such as by truncation.
- Converting a value to a narrower type, or determining whether such a conversion is possible (try_as or as? operations).
Background
C++ provides a collection of different kinds of casts and conversions from an expression x to a type T:
- Copy-initialization: T v = x;
- Direct-initialization: T v(x);
- Named casts:
  - static_cast<T>(x)
  - const_cast<T>(x)
  - reinterpret_cast<T>(x)
  - dynamic_cast<T>(x)
- C-style casts: T(x) or equivalently (T)x
  - These can do anything that static_cast, const_cast, and reinterpret_cast can do, but ignore access control on base classes.
- List-initialization: T{x}
  - This can do anything that implicit conversion can do, and can also initialize a single (real or notional) subobject of T.
  - Narrowing conversions are disallowed.
These conversions are all different, and each of them has some surprising or unsafe behavior.
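As a minimal C++ sketch of how these forms diverge on the same kind of conversion (the function names are hypothetical and exist only for illustration): copy-initialization and static_cast both accept a double-to-int conversion and silently drop the fraction, while list-initialization rejects it at compile time.

```cpp
#include <cstdint>

// Copy-initialization accepts the narrowing conversion and truncates toward zero.
int CopyInit(double x) {
  int v = x;
  return v;
}

// static_cast performs the same conversion, but spells it out at the call site.
int NamedCast(double x) { return static_cast<int>(x); }

// List-initialization rejects narrowing, so the line below would not compile:
//   int v{x};  // with x of type double: narrowing conversion
// A widening conversion, by contrast, is accepted.
std::int64_t ListInit(std::int32_t x) {
  std::int64_t v{x};
  return v;
}
```

Under these definitions, CopyInit(3.9) and NamedCast(3.9) both yield 3, with nothing in the first form to signal that information was lost.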
Swift provides four forms of type casting operation:
- x as T performs a conversion from subtype to supertype.
- pattern as T in a pattern matching context converts a pattern that matches a subtype to a pattern that matches a supertype, by performing a runtime type test. This effectively results in a checked supertype to subtype conversion.
- x as! T performs a conversion from supertype to subtype, with the assumption that the value inhabits the subtype.
- x as? T performs a conversion from supertype to subtype, producing an Optional.
- T(x) and similar construction expressions are used to convert between types without a subtyping relationship, such as between integer and floating-point types.
In Swift, x as T is always unsurprising and safe.
Rust provides the following:
- x as T performs a conversion to type T.
  - When there is no corresponding value, some specified value is produced: this conversion will perform 2’s complement truncation on integers and will saturate when converting large floating-point values to integers.
  - Conversions between distinct pointer types, and between pointers and integers, are permitted. Rust treats accesses through pointers as unsafe, but not pointer arithmetic or casting.
This cast can perform some conversions with surprising results, such as integer truncation. It can also have surprising performance implications, because it defines the behavior of converting an out-of-range value – for example, when converting from floating-point to integer – in ways that aren’t supported across all modern targets.
Haskell and Scala support type ascription notation, x : T. This has also been proposed for Rust. This notation constrains the type checker to find a type for the expression x that is consistent with T, and is used:
- for documentation purposes,
- to guide the type checker to select a particular meaning of the code in the presence of ambiguity, and
- as a diagnostic tool when attempting to understand type inference failures.
Proposal
Carbon provides a binary as operator.
x as T performs an unsurprising and safe conversion from x to type T.
- This can be used to perform any implicit conversion explicitly. As in Swift, this can therefore be used to convert from subtype to supertype.
- This can also be used to perform an unsurprising and safe conversion that cannot be performed implicitly because it’s lossy, such as from i32 to f32.
This operator does not perform conversions with domain restrictions, such as converting from f32 to i64, where sufficiently large values can’t be converted. It does not perform operations in which there are multiple different reasonable interpretations, such as converting from i64 to i32, where a two’s complement truncation might sometimes be reasonable but where the intent is more likely that it is an error to convert a value that does not fit into an i32.
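The distinction between a lossy conversion and a domain-restricted one can be illustrated in C++. This is a sketch only: it assumes float is IEEE 754 binary32 with round-to-nearest, and the helper names ViaF32 and FitsInI64 are hypothetical. Every i32 value has some f32 result, though it may be rounded; an f32 value converts to i64 only when it lies within range.

```cpp
#include <cstdint>

// i32 -> f32 is lossy but total: every input produces a result, possibly rounded.
// The float is widened to double on return, which preserves its value exactly.
double ViaF32(std::int32_t x) { return static_cast<float>(x); }

// f32 -> i64 has a domain restriction: only values in [-2^63, 2^63) can be
// converted. NaN fails both comparisons and is therefore rejected as well.
bool FitsInI64(float f) {
  return f >= -9223372036854775808.0f && f < 9223372036854775808.0f;
}
```

With a 24-bit significand, 16777217 (2^24 + 1) is not representable, so ViaF32(16777217) produces 16777216.0 under the stated assumption. FitsInI64(1e30f) is false, which is the kind of input the proposed as operator declines to handle.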
See changes to the design for details.
Rationale based on Carbon’s goals
- Code that is easy to read, understand, and write
  - Providing only unsurprising built-in as conversions, and encouraging user-defined types to do the same, makes code easier to understand.
- Practical safety and testing mechanisms
  - Syntactically distinguishing between always-safe as conversions and potentially-unsafe conversions being performed by other syntax makes it clearer which code should be the subject of more scrutiny when reasoning about safety.
- Interoperability with and migration from existing C++ code
  - The As interface provides the same functionality as single-argument explicit constructors and explicit conversion functions in C++, and can be used to expose those operations for interoperability purposes and as a replacement for those operations during migration.
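For reference, the two C++ features named in the last point look like this. The types Celsius and Fahrenheit and the function ToCelsiusDegrees are hypothetical, introduced only to show the shape of the C++ side; how such operations would actually be mapped onto the As interface is not specified here.

```cpp
// Hypothetical types, for illustration only.
struct Celsius {
  // Single-argument explicit constructor: double -> Celsius must be requested.
  explicit Celsius(double d) : degrees(d) {}
  double degrees;
};

struct Fahrenheit {
  double degrees;
  // Explicit conversion function: Fahrenheit -> Celsius must be requested.
  explicit operator Celsius() const {
    return Celsius((degrees - 32.0) * 5.0 / 9.0);
  }
};

double ToCelsiusDegrees(Fahrenheit f) {
  // `Celsius c = f;` would not compile, because the conversion is explicit.
  return static_cast<Celsius>(f).degrees;
}
```

In both cases the conversion is available only when written out explicitly, which is the behavior an explicit as expression is intended to replace during migration.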
Future work
Provide a mechanism for unsafe conversions
We need to provide additional conversions beyond those proposed for as. In particular, to supply the same set of conversions as C++, we would need at least the following conversions that don’t match the rules for as:
Conversions with a domain restriction:
- Conversions from pointer-to-supertype to pointer-to-subtype.
- Conversions from floating-point to integer types that assume the input is in-range.
- (Not in C++.) Conversions between any two integer types that assume the input is in-range.
Conversions that modify some values:
- Truncating conversions between any two integer types.
Conversions that reinterpret values:
- Conversions between arbitrary pointer types.
- Conversions between integers and pointers.
- Bit-casts between arbitrary, sufficiently-trivial types of the same size.
Special cases:
- Some analogue of dynamic_cast.
- Some analogue of const_cast.
We will need to decide which of these we wish to provide. In particular, depending on our plans for mutability and RTTI, const_cast and dynamic_cast may or may not be appropriate.
For the operations we do supply, we could provide either named functions or dedicated language syntax. While this proposal suggests that the as operator is not the appropriate language syntax for the above cases, that decision should be revisited once we have more information from examining the alternatives.
Casting operator for conversions with domain restrictions
We could provide an additional casting operator, such as assume_as or unsafe_as, to model conversions that have a domain restriction, such as i64 -> i32, f32 -> i64, or Base* -> Derived*.
Advantage:
- Provides additional important but unsafe functionality.
- Gives this functionality the appearance of being a central language feature.
- Separates safe conversions from unsafe ones.
Disadvantage:
- Increases complexity.
- The connection between these conversions may not be obvious, and the kind and amount of unsafety in practice differs substantially between them.
If we don’t follow this direction, we will need to provide these operations by another mechanism, such as named function calls.
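To make the named-function alternative concrete, here is a rough C++ sketch of an i64 -> i32 conversion with a domain restriction. The names FitsInI32 and AssumeInRange are hypothetical and are not part of this proposal; the sketch treats an out-of-range input as a programming error that is caught by an assertion in debug builds.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Reports whether `x` lies within the domain of the i64 -> i32 conversion.
bool FitsInI32(std::int64_t x) {
  return x >= std::numeric_limits<std::int32_t>::min() &&
         x <= std::numeric_limits<std::int32_t>::max();
}

// Narrows under the assumption that `x` is in range. An out-of-range input is
// a programming error: the assertion fires unless NDEBUG is defined.
std::int32_t AssumeInRange(std::int64_t x) {
  assert(FitsInI32(x));
  return static_cast<std::int32_t>(x);
}
```

A call such as AssumeInRange(42) returns 42. Whether this operation is spelled as a function call or as a dedicated operator, the distinct name marks the call site as one that deserves scrutiny.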
Alternatives considered
Allow as to perform some unsafe conversions
We could provide a single type-casting operator that can perform some conversions that have a domain restriction, treating values out of range as programming errors.
One particularly appealing option would be to permit as to convert freely between integer and floating-point types, but not permit it to convert from supertype to subtype.
Advantage:
- Developers may not want to be reminded about the possibility of overflow in conversions to integer types.
  - This would make as more consistent with arithmetic operations, which will likely have no overt indication that they’re unsafe in the presence of integer overflow.
  - If we don’t do this, then code mixing differently-sized types will need to use a syntactic notation other than as, even if all conversions remain in-bounds. If such code is common, as it is in C++ (for example, when mixing int and size_t), developers may become accustomed to using that “assume in range” notation and not consider it to be a warning sign, thereby eroding the advantage of using a distinct notation.
Disadvantage:
- If we allow this conversion, there would be no clear foundation for which conversions can be performed by as and which cannot in general.
- An as expression would be less suitable for selecting which operation to perform if it can be unsafe.
- Under maintenance, every usage of as would need additional scrutiny because it’s not in general a safe operation.
- This risks being surprising to developers coming from C and C++ where integer type conversions are always safe.
The choice to not provide these operations with as is experimental, and should be revisited when we have more information about the design of integer types and their behavior.
Allow as to perform two’s complement truncation
We could allow as to convert between any two integer types, performing a two’s complement conversion between these types.
Advantage:
- Familiar to developers from C++ and various other systems programming languages.
Disadvantage:
- Makes as conversions have behavior that diverges from the behavior of arithmetic, where we expect at least signed overflow to be considered a programming error rather than being guaranteed to wrap around.
- Introducing a common and easy notation for conversion with wraparound means that this notation will also be used in the (likely much more common) case of wanting to truncate a value that is already known to be in-bounds. Compared to having distinct notation for these two operations:
  - Because the same syntax is used for both, this removes the ability to distinguish programming errors due to overflow from intentional wraparound, both for readers of the code and for automated checks in debugging builds.
  - This removes the ability to optimize on the basis of knowing that a value is expected to be in-bounds when performing a narrowing conversion.
The choice to not provide these operations with as is experimental, and should be revisited when we have more information about the design of integer types and their behavior.
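The C++ behavior that this alternative would mirror can be sketched as follows. The function names are hypothetical. Conversion to an unsigned type is always reduced modulo 2^N; for the signed case, the wrapped result is guaranteed only since C++20 and was implementation-defined before that, so the signed function assumes a two's complement implementation.

```cpp
#include <cstdint>

// Unsigned narrowing keeps the low 32 bits (reduction modulo 2^32).
std::uint32_t TruncateToU32(std::uint64_t x) {
  return static_cast<std::uint32_t>(x);
}

// Signed narrowing also keeps the low 32 bits, reinterpreted as two's
// complement. Guaranteed since C++20; implementation-defined earlier.
std::int32_t TruncateToI32(std::int64_t x) {
  return static_cast<std::int32_t>(x);
}
```

Here TruncateToI32(4294967297) yields 1 and TruncateToI32(2147483648) yields the most negative i32. Neither call site shows whether the wraparound was intended or is an overflow bug, which is the first disadvantage listed above.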
as only performs implicit conversions
We could limit as to performing only implicit conversions. This would mean that as cannot perform lossy conversions.
Advantage:
- One fewer set of rules for developers to be aware of.
Disadvantage:
- Converting between integer and floating-point types is common, and providing built-in syntax for it seems valuable.
Integer to bool conversions
We could allow a conversion of integer types (and perhaps even floating-point types) to bool, converting non-zero values to true and converting zeroes to false.
Advantage:
- This treatment of non-zero values as being “truthy” and zero values as being “falsy” is familiar to developers of various other languages.
- Uniform treatment of types that can be notionally converted to a Boolean value may be useful in templates and generics in some cases.
Disadvantage:
- The lossy treatment of all non-zero values as being “truthy” is somewhat arbitrary and can be confusing.
- An as bool conversion is less clear to a reader than a != 0 test.
- An as bool conversion is more verbose than a != 0 test.
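C++ already permits these conversions, so it can serve as a small illustration of the trade-off. In this sketch (function names hypothetical), the integer conversion and the explicit comparison agree, while the floating-point case shows how the "truthy" rule can surprise: 0.5 converts to true directly, but to false if it is first truncated to an integer.

```cpp
// Integer -> bool as C++ defines it: zero is false, anything else is true.
bool Truthy(int x) { return static_cast<bool>(x); }

// The explicit comparison, which states the intent directly.
bool IsNonZero(int x) { return x != 0; }

// Floating-point -> bool: any non-zero value, including 0.5, is true.
bool TruthyDirect(double d) { return static_cast<bool>(d); }

// Going through an integer first truncates 0.5 to 0, which is false.
bool TruthyViaInt(double d) { return static_cast<bool>(static_cast<int>(d)); }
```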
Bool to integer conversions
We could disallow conversions from bool to iN types.
Advantage:
- More clearly demarcates the intended semantics of bool as a truth value rather than as a number.
- Avoids making a choice as to whether true should map to 1 (zero-extension) or -1 (sign-extension).
  - But there is a strong established convention of using 1.
- Such conversions are a known source of bugs, especially when performed implicitly. as conversions will likely be fairly common and routine in Carbon code due to their use in generics. As such, they may be written without much thought and not given much scrutiny in code review.

    var found: bool = false;
    var total_found: i32 = 0;
    for (var (key: i32, value: i32) in list) {
      if (key == expected) {
        found = true;
        total_found += value;
      }
    }
    // Include an explicit `as i64` to emphasize that we're widening the
    // total at this point.
    // Bug: meant to pass `total_found` not `found` here.
    add_to_total(found as i64);
Disadvantage:
- Removes a sometimes-useful operation for which there isn’t a similarly terse alternative expression form.
  - But we could add a member function b.AsBit() if we wanted.
- Does not expose the intended connection between the bool type and bits.
We could disallow conversion from bool to i1.
Advantage:
- Avoids a surprising behavior where this conversion converts true to -1 whereas all others convert true to 1.
Disadvantage:
- Results in non-uniform treatment of conversion from bool, and an awkward special case that may get in the way of generics.
- A conversion from bool that produces -1 for a true value is useful when producing a mask, for example in (b as i1) as u32.
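The mask idiom mentioned in the last point can be sketched in C++, which has no 1-bit signed integer type, so the sign extension is emulated with unsigned subtraction. The names AsBit, AsMask, and Select are hypothetical helpers for this illustration.

```cpp
#include <cstdint>

// bool -> 1 or 0: the zero-extension convention.
std::uint32_t AsBit(bool b) { return static_cast<std::uint32_t>(b); }

// bool -> all-ones or all-zeros, emulating `(b as i1) as u32` where true maps
// to -1. Unsigned subtraction wraps, so 0 - 1 gives 0xFFFFFFFF.
std::uint32_t AsMask(bool b) { return 0u - AsBit(b); }

// Branch-free select built from the mask.
std::uint32_t Select(bool b, std::uint32_t if_true, std::uint32_t if_false) {
  const std::uint32_t mask = AsMask(b);
  return (if_true & mask) | (if_false & ~mask);
}
```

With these definitions, Select(true, 7, 9) evaluates to 7 and Select(false, 7, 9) to 9 without branching on b.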