Numeric literal semantics
Table of contents
Table of contents
Problem
When a numeric literal appears in a program, we need to understand its semantics:
 What type does it have?
 What value is produced by operations on it?
 When can it validly be used to initialize an object?
Background
In C++, numeric literals have either an integral type or a floatingpoint type. C++ provides permission for implementations to add extended integral types, but in practice (for bad reasons relating to intmax_t
) implementations do not do so, so there are a small finite set of types that any given numeric literal might have:
int
,long
,long long
, orunsigned
versions of thesefloat
,double
, orlong double
The choice of type is determined solely by the literal.
The C++ approach is errorprone and problematic:
 Lossy conversions from literals in initializers are permitted.
 Lossy operations on literals are permitted; for example, on a typical implementation,
1 << 60
has value0
because1
is a 32bit type.  Attempting to naturally express some values has undefined behavior; for example,
int x = 2147483648;
typically results in undefined behavior even when 2147483648 is a validint
value.  Integer literals with value 0 have special semantics that are lost when the integer is passed to a function: “perfect” forwarding doesn’t work for such literals.
 The builtin types are privileged: only the types listed above have literals. There is no syntax for a 64bit integer literal, only for (for example) a
long int
literal, which may or may not 64 bits wide.  The type of a literal can be unpredictable in portable code, as it can depend on which type a particular value happens to fit into.
Proposal
Numeric literals have a type derived from their value, and can be converted to any type that can represent that value.
Simple operations such as arithmetic that involve only literals also produce values of literal types.
Details
Numeric literals have a type derived from their value. Two integer literals have the same type if and only if they represent the same integer. Two real number literals have the same type if and only if they represent the same real number.
That is:
 For every integer, there is a type representing literals with that integer value.
 For every rational number, there is a type representing literals with that real value.
 The types for real numbers are distinct from the types for integers, even for real numbers that represent integers.
var x: i32 = 1.0;
is invalid.
Primitive operators are available between numeric literals, and produce values with numeric literal types. For example, the type of 1 + 2
is the same as the type of 3
.
Numeric types can provide conversions to support initialization from numeric literals. Because the value of the literal is carried in the type, a typelevel decision can be made as to whether the conversion is valid.
The integer types defined in the standard library permit conversion from integer literal types whose values are representable in the integer type. The floatingpoint types defined in the Carbon library permit conversion from integer and rational literal types whose values are between the minimum and maximum finite value representable in the floatingpoint type.
Prelude support
The following types are defined in the Carbon prelude:

An arbitraryprecision integer type.
class BigInt;

A rational type, parameterized by a type used for its numerator and denominator.
class Rational(T:! Type);
The exact constraints on
T
are not yet decided. 
A type representing integer literals.
class IntLiteral(N:! BigInt);

A type representing floatingpoint literals.
class FloatLiteral(X:! Rational(BigInt));
All of these types are usable during compilation. BigInt
supports the same operations as Int(n)
. Rational(T)
supports the same operations as Float(n)
.
The types IntLiteral(n)
and FloatLiteral(x)
also support primitive integer and floatingpoint operations such as arithmetic and comparison, but these operations are typically heterogeneous: for example, an addition between IntLiteral(n)
and IntLiteral(m)
produces a value of type IntLiteral(n + m)
.
Implicit conversions
IntLiteral(n)
converts to any sufficiently large integer type, as if by:
impl [template N:! BigInt, template M:! BigInt]
IntLiteral(N) as ImplicitAs(Int(M))
if N >= Int(M).MinValue as BigInt and N <= Int(M).MaxValue as BigInt {
...
}
impl [template N:! BigInt, template M:! BigInt]
IntLiteral(N) as ImplicitAs(Unsigned(M))
if N >= Int(M).MinValue as BigInt and N <= Int(M).MaxValue as BigInt {
...
}
The above is for exposition purposes only; various parts of this syntax are not yet decided.
Similarly, IntLiteral(x)
and FloatLiteral(x)
convert to any sufficiently large floatingpoint type, and produce the nearest representable floatingpoint value. Conversions in which x
lies exactly halfway between two values are rejected, as previously decided. Conversions in which x
is outside the range of finite values of the floatingpoint type are also rejected, rather than saturating to the finite range or producing an infinity.
Examples
// This is OK: the initializer is of the integer literal type with value
// 2147483648 despite being written as a unary `` applied to a literal.
var x: i32 = 2147483648;
// This initializes y to 2^60.
var y: i64 = 1 << 60;
// This forms a rational literal whose value is one third, and converts it to
// the nearest representable value of type `f64`.
var z: f64 = 1.0 / 3.0;
// This is an error: 300 cannot be represented in type `i8`.
var c: i8 = 300;
fn f[template T:! Type](v: T) {
var x: i32 = v * 2;
}
// OK: x = 2_000_000_000.
f(1_000_000_000);
// Error: 4_000_000_000 can't be represented in type `i32`.
f(2_000_000_000);
// No storage required for the bound when it's of integer literal type.
struct Span(template T:! Type, template BoundT:! Type) {
var begin: T*;
var bound: BoundT;
}
// Returns 1, because 1.3 can implicitly convert to f32, even though conversion
// to f64 might be a more exact match.
fn G() > i32 {
match (1.3) {
case _: f32 => { return 1; }
case _: f64 => { return 2; }
}
}
// Can only be called with a literal 0.
fn PassMeZero(_: IntLiteral(0));
// Can only be called with integer literals in the given range.
fn ConvertToByte[template N:! BigInt](_: IntLiteral(N)) > i8
if N >= 128 and N <= 127 {
return N as i8;
}
// Given any int literal, produces a literal whose value is one higher.
fn OneHigher(L: IntLiteral(template _:! BigInt)) > auto {
return L + 1;
}
// Error: 256 can't be represented in type `i8`.
var v: i8 = OneHigher(255);
Alternatives considered
Use an ordinary integer or floatingpoint type for literals
We could decide on a fixedwidth type based on the form of the literal, for example using a type suffix with some rules to determine what type to pick for unsuffixed literals.
Advantages:
 This follows what C++ does.
 Can determine the type of a floatingpoint number without requiring contextual information.
Disadvantages:
 Surprising behavior when applying an operator to a literal would result in overflow. Even if we diagnose this, a diagnostic that
2147483648
is invalid because it overflows is surprising.  Creates additional literal syntax that users will need to understand.
 May select types that don’t match the programmer’s expectations.
 Whatever types we pick are privileged.
Use same type for all literals
We could give literals a single, arbitraryprecision type (say, Integer
for integer literals and Rational
for real literals).
Advantages:
 Only introduces two new types, not an unbounded parameterized family of types.
 Writing a function that takes any integer literal can be done with more obvious syntax and less syntactic overhead. Instead of:
fn OneHigher(L: IntLiteral(template _:! BigInt));
we could write
fn OneHigher(template L:! Integer);
However, with this proposal, a function taking any integer expression that can be evaluated to a constant can be written as
fn F(template N:! BigInt);
and such a function would accept all integer literals, as well as nonliteral constants.
Disadvantages:
 Our mechanism for specifying the behavior of operations such as arithmetic is based on interface implementations, which are looked up by type. Supporting
impl
selection based on values would introduce substantial complexity.  If we introduce an arbitraryprecision integer type, it would be inconsistent to support it only during compilation. However, if we allow its use at runtime, programs may use it accidentally, with an invisible performance cost. For example,
var x: auto = 123;
would result inx
having an infiniteprecision type, possibly involving invisible dynamic allocation. Under this proposal, the type of
x
is a type that can only represent the value123
; as such,x
is effectively immutable. The arbitraryprecision integer type introduced in this proposal can only be used explicitly by programs naming it.
 Under this proposal, the type of
Allow leading 
in literal tokens
We could treat a leading 
character as part of a numeric literal token, so that – for example – 123
would be a single 123
token rather than a unary negation applied to a literal 123
.
Advantages:
 This would narrowly solve the problem that
INT_MIN
cannot be written directly, without any of the other implications of this proposal.
Disadvantages:
 Makes the behavior of unary

less uniform.  Prevents the introduction of infix or postfix operators that bind more tightly than unary

, such as an infix exponentiation operator:2**2
may be expected to evaluate to 4, not to +4.