`ref` parameters, arguments, returns and `val` returns
Table of contents
- Abstract
- Problem
- Background
- Proposal
- Details
    - Compound return forms and patterns
    - Nested binding patterns
    - Mutation restriction on objects bound to a value
    - No optimization on erroneous behavior
    - `bound` parameters
    - How addresses interact with `ref`
    - Improved interop and migration with C++ references
    - Part of the expression type system, not object types
    - Interaction with `returned var`
    - Use case: `Deref` interface
    - Use case: indexing interfaces
    - Use case: member binding interfaces
    - Use case: class accessors
    - Type completeness
    - Pointer value representation
- Future work
- Rationale
- Alternatives considered
    - No `ref`, only pointers
    - Remove pointers after adding references
    - Allow `ref` bindings in the fields of classes
    - No call-site annotation
    - Top-level `ref` introducer
    - Allow immutable value semantic bindings nested within variable patterns
    - Remove `var` as a top-level statement introducer
    - `ref` as a type qualifier
    - `bound` would change the default return to `val`
    - Other return conventions
    - `return var` with compound return forms
    - Other syntax for compound return forms
    - `ref` parameters allow aliasing
    - `let` to mark value returns instead of `val`
    - `=>` infers form, not just type
Abstract
- A parameter binding can be marked `ref` instead of `var` or the default. It will bind to reference argument expressions in the caller and produce a reference expression in the callee.
    - Unlike pointers, a `ref` binding can’t be rebound to a different object.
    - This replaces `addr`, and is not restricted to the `self` parameter.
    - A `ref` binding, like a value binding, can’t be used in fields of classes or structs.
    - When calling functions, arguments to non-`self` `ref` parameters are also marked with `ref`.
- The return of a function can optionally be marked `ref`, `val`, or `var`. These control the category of the call expression invoking the function, and how the return expression is returned.
    - These may be mixed for functions returning tuple or struct forms.
    - Any parameters whose lifetime needs to contain the lifetime of the return must be marked `bound`.
- The address of a `ref` binding is `nocapture` and `noalias`.
- We mark parameters of a function that may be referenced by the return value with `bound`.
Problem
Reference bindings have come up multiple times:
- as a better alternative to `addr self: Self*`;
- for use in lambda captures;
- to model different kinds of C++ references for interop and migration;
- to support nested bindings within a destructured `var`, see issue #5250 and proposal #5164;
- for forwarding arguments while preserving expression category;
- to add a feature to pattern matching to modify things after they have been matched;
- to support refactoring code without changing all the uses of a name, a problem we are already seeing with `self` and `addr self`, and would be a point of friction in local pattern matching in the future; and
- to support breaking up an expression into pieces without altering the expression category of individual pieces.
Reference returns have also come up before, particularly to support operators such as indexing `[`…`]` and other functions that should produce a reference expression. It is desirable, though, that this not introduce new memory unsafety concerns, due to returning a reference to something with insufficient lifetime.
In addition, we have been interested in adding other return mechanisms that support returning values in registers in cases that our current convention won’t.
Background
- Carbon has reference expressions.
- Using the `addr` keyword on mutating methods to get a `self` with a pointer type was introduced in proposal #722: “Nominal classes and methods”.
- Leads issue #5261: “We should add `ref` bindings to Carbon, paralleling reference expressions” supports adding `ref` bindings to Carbon.
- LLVM’s `noalias` attribute is used to mark a pointer as being aliased in only limited ways to enable optimization. Also see LLVM’s pointer aliasing rules.
- Marking a pointer as not captured, to allow optimizations, was originally done with LLVM’s `nocapture` attribute, which has become `captures(none)` and `captures(ret: address, provenance)`, which is governed by pointer capture rules.
- Clang allows C++ code to use the `clang::lifetimebound` attribute to mark parameters that may be referenced by the return value, in order to detect some classes of use-after-free memory-safety bugs.
- C++ has reference types.
Proposal
`ref` bindings
We introduce a new keyword `ref`. This may be added to a `:` binding to mark it as binding to a reference expression, as in:
fn F(ptr: i32*) {
// A reference binding `x`.
let ref x: i32 = *ptr;
// Use of `x` is a reference expression that
// refers to the same object as `*ptr`.
Assert(&x == ptr);
// Equivalent to `*ptr += 1;`.
x += 1;
}
fn G() {
var y: i32 = 2;
F(&y);
Assert(y == 3);
}
The use of the name (`x` in the example) of a `ref` binding forms a durable reference expression. We ensure that reference expressions formed by way of reference bindings do not dangle. A `ref` binding may only bind to a durable reference expression or an expression that can be converted to one. The bound durable reference expression must outlive the `ref` binding.
The address of a `ref` bound name gives the address of the bound object, so `&x == ptr` above. The reference itself does not have an address, and unlike a pointer can’t be rebound to reference a different object.
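For readers coming from C++, the address and rebinding rules above can be compared with an ordinary C++ reference. The sketch below is only an analogy: the names `AddOneThroughReference` and `Demo` are hypothetical, and a Carbon `ref` binding carries restrictions (such as the experimental `noalias` rule) that a C++ reference does not.

```cpp
#include <cassert>

// Analogy only: `x` plays the role of `let ref x: i32 = *ptr;`.
void AddOneThroughReference(int* ptr) {
  int& x = *ptr;
  // The address of the reference is the address of the bound object.
  assert(&x == ptr);
  // Equivalent to `*ptr += 1;`.
  x += 1;
}

// Mirrors `G` above: the caller observes the mutation.
int Demo() {
  int y = 2;
  AddOneThroughReference(&y);
  return y;
}
```

As with the Carbon example, the reference has no address of its own and cannot be made to refer to a different object after it is bound.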
We remove `addr`, and instead use `ref` for the `self` parameter when an object is required. Note that the type will change from `Self*` to `Self` in this case. In the future, we might re-add `addr` if needed.
class C {
// ❌ No longer valid.
fn OldMethod[addr self: Self*]() {
// Previously would dereference `self` in
// the body of the method.
self->x += 3;
}
// ✅ Now valid.
fn NewMethod[ref self: Self]() {
// Now `self` is a reference expression,
// and is not dereferenced.
self.x += 3;
}
// ✅ Other uses are unchanged.
fn Get[self: Self]() -> i32 {
return self.x;
}
var x: i32;
}
Potentially abbreviating the syntax further (to allow `ref self` as a short form of `ref self: Self`) is left as future work.
The `ref` modifier is allowed on `:` bindings that are not:
- inside a `var` pattern,
- a field of a `class` type, or
- a field of a struct type.
fn AddTwoToRef(ref x: i32) {
x += 1;
let ref y: i32 = x;
y += 1;
}
// Equivalent to:
fn AddTwoToRef(ref x: i32) {
x += 1;
let y_ptr: i32* = &x;
*y_ptr += 1;
}
We add support for `ref` and `var` in a struct pattern when using the shorthand `a: T` syntax for `.a = a: T`:
let {var a: i32, ref b: i32} = ...;
// Now equivalent to:
let {.a = var a: i32, .b = ref b: i32} = ...;
Note: This takes us one step closer to `{` ambiguity. Previously we could distinguish between a struct literal/pattern and a non-empty block with only up to two tokens of lookahead (the struct cases start with `.` or `_` or identifier followed by `:`, and the block cases don’t). Now we have things like:
fn F() -> X { var a: i32 = 0; }
… where we’re getting incrementally closer to ambiguity. We’ve got a few more steps before we get there, though, since we don’t have an `X{...}` expression yet, and `var ...` is only allowed in struct patterns rather than struct expressions. So we’re still fine, but this is cutting down our options for future syntactic expansion a little.
The `ref` modifier is forbidden on the bindings in `class` or struct type fields.
var outer_size: i32 = 123;
class Invalid {
// ❌ Invalid. We don't currently have runtime `let` bindings in classes,
// or `ref` on `var`s, but the intent is to not have `ref` bindings as fields.
let ref invalid_ref_field: i32 = outer_size;
}
// ❌ Invalid.
var invalid_struct_type_field:
{ref .invalid: i32} = {.invalid = outer_size};
In a function argument list, arguments to non-`self` `ref` parameters are also marked with `ref`. Continuing the example:
var z: i32 = 3;
AddTwoToRef(ref z);
Assert(z == 5);
// No `ref` though on the `self` argument.
var c: C = {.x = 4};
c.NewMethod();
Assert(c.Get() == 7);
Note: It is important that this restriction is syntactic, not just semantic, because it means that `ref` is never the first token of a full expression, and so we know without lookahead that a `ref` in a pattern context must be the start of a binding pattern, not the start of an expression pattern.
Normally an argument to a non-`ref` parameter should not be marked `ref`, but it is allowed in a generic context where the parameter may sometimes be `ref`.
Expression operators will mostly not take `ref` parameters, with these exceptions:
- the address-of operator `&`;
- the first operand of the indexing operator `[`…`]`; and
- the member access operator `.`, introduced in proposal #3720: “Member binding operators”.
The statement operators now use `ref` instead of pointers:
- the left-hand operand of assignment operators such as `=` and `+=`; and
- the `++` and `--` operators.
Even in these cases, the arguments will not be marked with `ref` at the call site. (Generally the `ref` parameter is the `self` parameter, and so wouldn’t be marked. The exception is `BindToRef`, but we don’t want to mark its argument with `ref`.)
As an experiment, we are saying a pointer formed by taking the address of a `ref` bound name is LLVM-`captures(none)` and LLVM-`noalias`. This means that while a `ref` parameter could be passed into a function by address, the restrictions also allow a “move-in-move-out” approach (once we define the move operation), assuming it is not marked `bound`. The intent here is to leave the door open to a calling convention using registers and less indirection for small-enough objects.
This means that the following code is invalid:
fn F(ref a: i32, ref b: i32) -> bool;
fn G() -> bool {
var v: i32 = 1;
return F(ref v, ref v);
}
Enforcing this restriction will be part of the memory safety story. Until then, doing this is erroneous behavior. This means that the compiler won’t use those LLVM attributes unless the compiler can itself prove that the restrictions hold.
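To see why passing the same object to two `ref` parameters has to be ruled out before a “move-in-move-out” convention is possible, the following C++ sketch compares the two conventions. It is an illustration under assumptions, not Carbon’s actual lowering: the function names are hypothetical, and the copy-out order (first parameter, then second) is chosen arbitrarily.

```cpp
#include <utility>

// By-address convention: both parameters refer to caller storage.
void BumpByAddress(int& a, int& b) {
  a += 1;
  b += a;
}

// Hypothetical "move-in-move-out" convention: values are copied in,
// updated locally, and handed back for the caller to copy out.
std::pair<int, int> BumpInRegisters(int a, int b) {
  a += 1;
  b += a;
  return std::pair<int, int>(a, b);
}

// With distinct objects, both conventions produce the same result.
bool AgreeWhenDistinct() {
  int x = 1;
  int y = 10;
  BumpByAddress(x, y);
  std::pair<int, int> out = BumpInRegisters(1, 10);
  return x == out.first && y == out.second;
}

// With the same object passed twice, the results diverge.
bool AgreeWhenAliased() {
  int v = 1;
  BumpByAddress(v, v);  // v becomes 2, then 2 + 2 = 4.
  std::pair<int, int> out = BumpInRegisters(1, 1);  // (2, 3)
  int w = out.first;  // Copy out the first parameter...
  w = out.second;     // ...then the second, leaving 3.
  return v == w;
}
```

Because the two conventions only agree when the arguments do not alias, a compiler may switch between them freely only if aliasing calls such as `F(ref v, ref v)` are excluded.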
`ref` and `val` returns
The return of a function can optionally be marked `ref` or `val`. These control the category of the call expression invoking the function, and how the return expression is returned.
var global: i32 = 2;
fn ReturnRef() -> ref i32 {
// ❌ Invalid: return 2;
// ✅ Valid: return a reference expression with
// sufficient lifetime.
return global;
}
// Call `ReturnRef` and use the resulting reference.
ReturnRef() += 3;
Assert(global == 5);
// Result of `ReturnRef` can be bound using a `ref`
// binding.
fn AddFive() {
let ref r: i32 = ReturnRef();
r += 5;
}
AddFive();
Assert(global == 10);
fn ReturnVal() -> val i32 {
return 2;
}
// ReturnVal() is a value expression.
let l: i32 = ReturnVal();
// Returning an initializing expression is the
// default.
fn ReturnDefault() -> i32 {
return 2;
}
// ReturnDefault() is an initializing expression.
var j: i32 = ReturnDefault();
// Use `var` to explicitly specify returning an
// initializing expression.
fn ReturnVar() -> var i32 {
return 2;
}
// `ReturnVar()` is the same as `ReturnDefault()`.
- A call to a function declared `-> ref T` is a durable reference expression. The generated code for that function will return the address of a `T` object.
- A call to a function declared `-> val T` is a value expression. The function will return the value representation of `T`. Since values have no address, the value representation may be returned in registers.
- The behavior of a call to a function declared `-> T` is unchanged. It is an initializing expression, returning in place or by copy depending on the initializing representation of `T`. This is the same behavior as `-> var T`.
- The behavior of `auto` as the return type is unchanged, but now supports an optional `ref`, `val`, or `var` between the `->` and `auto`. `-> auto` continues to return an initializing expression, as does `-> var auto`. `-> val auto` returns a value expression, and `-> ref auto` returns a durable reference expression.
- Using `=>` to specify a return continues to return an initializing expression, as before. See the alternative considered “`=>` infers form, not just type”.
A function may have multiple returns, each with its own marker, by using a tuple or struct compound return form.
fn TupleReturn() -> (val bool, ref i32, C) {
return (true, global, {.x = 3});
}
fn StructReturn()
-> {.a: val bool,
.b: ref i32,
.c: C} {
return {.a = true,
.b = global,
.c = {.x = 3}};
}
If the return of a function may reference the storage of one or more parameters to the function, those parameters must be marked `bound`. This allows the compiler to diagnose if the function’s return is used after the lifetime of the `bound` parameter ends. The semantics of `bound` are intended to match the `clang::lifetimebound` attribute.
fn Member(bound ref c: C) -> ref i32 {
return c.x;
}
// Lifetime of a pointer includes the lifetime
// of what it points to.
fn Deref(bound p: i32*) -> ref i32 {
return *p;
}
fn Both(bound pc: C*) -> ref i32 {
return pc->x;
}
fn Invalid1() -> ref i32 {
var x: i32 = 4;
// ❌ Error: returning reference to `x`
// whose lifetime ends when this function
// returns.
return x;
}
fn Invalid2() -> ref i32 {
var c: C = {.x = 1};
// ❌ Error: returning reference bound to `c`
// whose lifetime ends when this function
// returns.
return Member(ref c);
}
The address of a `bound ref` parameter is marked with the LLVM attribute `captures(ret: address, provenance)` instead of `captures(none)`.
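Since `bound` is intended to match `clang::lifetimebound`, the `Member` and `Deref` examples above correspond roughly to the C++ sketch below. The macro name `LIFETIME_BOUND` and the `Demo` function are hypothetical; the macro only exists so that compilers other than Clang ignore the attribute.

```cpp
#include <cassert>

// `[[clang::lifetimebound]]` is Clang-specific, so hide it behind a
// (hypothetical) macro that expands to nothing on other compilers.
#if defined(__clang__)
#define LIFETIME_BOUND [[clang::lifetimebound]]
#else
#define LIFETIME_BOUND
#endif

struct C {
  int x;
};

// Roughly `fn Member(bound ref c: C) -> ref i32`.
int& Member(C& c LIFETIME_BOUND) { return c.x; }

// Roughly `fn Deref(bound p: i32*) -> ref i32`.
int& Deref(int* p LIFETIME_BOUND) { return *p; }

int Demo() {
  C c{1};
  // The result refers to storage owned by the annotated argument.
  assert(&Member(c) == &c.x);
  Member(c) += 2;    // c.x is now 3.
  Deref(&c.x) += 4;  // c.x is now 7.
  return c.x;
}
```

The attribute does not change what these functions do. It only gives Clang enough information to warn about some uses of the result after the annotated argument’s lifetime has ended.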
Details
The intent is that we would encourage using references instead of pointers when possible. Their benefits are related to their limitations, so to get those benefits we should use them when a use is restricted enough to be within those limitations.
Compound return forms and patterns
Mirroring the tuple and struct pattern forms, we also support tuple and struct return forms.
// `->` begins a "return form"
fn F() -> <return-form>;
// Within any return form, if the first token is
// `val`, `ref`, `var`, `(`, or `{`, it is not
// treated as type expression:
// Value return, with type as specified
fn Val() -> val <type-expr>
// Reference return, with type as specified
fn Ref() -> ref <type-expr>
// Initializing return, with type as specified
fn Var() -> var <type-expr>
// Tuple compound return, with a list of
// return forms.
fn TupleCompound() -> ( <return-form>, ... )
// Tuple return, with a list of type
// expressions. Used if all members of the
// list are type expressions.
fn Tuple() -> ( <type-expr>, ... )
// Struct compound return, with a mapping from
// designators to return forms.
fn StructCompound()
-> { .<id>: <return-form>, ... }
// Struct return, used if all of the members
// are type expressions.
fn Struct() -> { .<id>: <type-expr>, ... }
// Otherwise, implicit `var` means returns
// an initializing expression.
fn Other() -> <type-expr>
Note that in the absence of `val`, `ref`, and `var` keywords, the implicit `var` is placed in the outermost position, minimizing the number of primitive forms returned. So `fn F() -> (i32, i32)` means `fn F() -> var (i32, i32)` not `fn F() -> (var i32, var i32)`. Generally the `var` is left off if not required, and so will be rare in return forms, to minimize confusion with `val`.
fn TupleReturn(...)
-> (val bool, ref i32, C);
// Equivalent to:
// -> (val bool, ref i32, var C);
let (a: bool, ref b: i32, var c: C)
= TupleReturn(...);
fn StructReturn(...)
-> {.a: val bool,
.b: ref i32,
.c: C};
// Equivalent to:
// -> {.a: val bool,
// .b: ref i32,
// .c: var C};
// Binds to the names `x`, `y`, `z`:
let {.a = x: bool,
.b = ref y: i32,
.c = var z: C} = StructReturn(...);
// Binds to the names `a`, `b`, `c`:
let {a: bool,
ref b: i32,
var c: C} = StructReturn(...);
// Above two can be mixed, binding to
// names `a`, `y`, `z`.
let {a: bool,
.b = ref y: i32,
.c = var z: C} = StructReturn(...);
Only types are allowed after a `-> val`, `-> ref`, or `-> var`, not a compound return form. Examples:
// Returns a tuple of type
// `(bool, f32, C, i32)`.
fn OneTupleReturn(...)
-> (bool, f32, C, i32);
// Returns a compound tuple form
fn CompoundReturn(...)
-> (bool, val f32);
// Equivalent to:
// -> (var bool, val f32);
// ❌ Invalid, can't specify `ref` inside
// of `val`.
fn Invalid(...) -> val (bool, ref f32);
The compound return forms may be nested, as in:
fn CompoundInParens(...)
-> ({.a: bool, .b: val f32}, C, ref i32);
// Equivalent to:
// -> ({.a: var bool, .b: val f32}, var C, ref i32);
let ({.a = var x: bool, .b = val y: f32},
var c: C, ref d: i32) = CompoundInParens(...);
// or without renaming:
let ({var a: bool, b: f32},
var c: C, ref d: i32) = CompoundInParens(...);
// Contrast with a compound tuple form containing
// a struct type (not compound):
fn StructInParens(...)
-> ({.a: bool, .b: f32}, C, ref i32);
// Equivalent to:
// -> (var {.a: bool, .b: f32}, var C, ref i32);
fn CompoundInBraces(...)
-> {.a: bool, .b: (val f32, C), .c: ref i32};
// Equivalent to:
// -> {.a: var bool, .b: (val f32, var C), .c: ref i32};
let {a: bool,
.b = (x: f32, var y: C),
ref c: i32} = CompoundInBraces(...);
// Contrast with a compound struct form containing
// a tuple type (not compound):
fn TupleInBraces(...)
-> {.a: bool, .b: (f32, C), .c: ref i32};
// Equivalent to:
// -> {.a: var bool, .b: var (f32, C), .c: ref i32};
This feature is intended to support cases like `enumerate` that will want to return a value for the index but a reference to the element of the sequence being enumerated.
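The `enumerate` use case can be approximated in C++ by returning a pair whose second member is a reference. This is a sketch under assumptions rather than a description of how Carbon would implement compound return forms: `IndexedElement` and `AddIndexToEach` are hypothetical names, and a C++ `std::pair` puts the reference into an object type, which this proposal avoids.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: the index is returned by value and the element by
// reference, loosely mirroring a compound return such as `(val i32, ref T)`.
std::pair<std::size_t, int&> IndexedElement(std::vector<int>& seq,
                                            std::size_t i) {
  assert(i < seq.size());
  return std::pair<std::size_t, int&>(i, seq[i]);
}

// Adds each element's index to it through the reference half of the pair.
void AddIndexToEach(std::vector<int>& seq) {
  for (std::size_t i = 0; i < seq.size(); ++i) {
    std::pair<std::size_t, int&> item = IndexedElement(seq, i);
    item.second += static_cast<int>(item.first);
  }
}
```

The caller mutates the sequence through the returned reference while the index is an independent value.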
Nested binding patterns
Since a `ref` binding may only bind to a durable reference expression, it can’t be used to bind the result of a function returning an initializing expression. However, if the initializing expression is bound to a `var`, any nested patterns are reference binding patterns bound to the subobject, following proposal #5164: “Updates to pattern matching for objects”.
For example:
fn F() -> (bool, (C, i32));
let (b: bool, var (c: C, i: i32)) = F();
is equivalent to:
fn F() -> (bool, (C, i32));
let (b: bool, var v: (C, i32)) = F();
let ref c: C = v.0;
let ref i: i32 = v.1;
Note that `ref` is disallowed inside `var` since that would be redundant.
Mutation restriction on objects bound to a value
Mutation of objects with a non-copy-value representation in an active value binding (“borrowed objects”) is erroneous behavior.
- Our plan is to prevent mutation of borrowed objects in Carbon’s strict safe dialect.
- We should only relax our stance here and consider making such mutation allowed if we discover difficulty with this that we cannot overcome.
- But we should revisit the underlying idea of mutation being erroneous if enforcing it in the strict mode proves fundamentally untenable due to ergonomic costs.
- There will always be the potential for unchecked code, either unsafe Carbon code or C++ code by way of interop, to mutate a borrowed object, hence the need to define it as erroneous behavior.
- There is no need to ever make it anything more than erroneous behavior, see below.
- If we can prove the mutation doesn’t occur, then we can use that to optimize under “as-if”, and we don’t need anything else.
- We are deferring the decision of whether strict enforcement is enabled in Carbon’s current C++-friendly mode, when not explicitly marking the code as “unsafe.”
This was discussed 2025-05-22 and then made the subject of leads issue #5524.
No optimization on erroneous behavior
The Carbon compiler should not optimize on erroneous behavior, ever, unless the compiler literally proves that it does not occur, ever. In which case, we don’t need any license to start optimizing on this, as it falls under “as-if”.
The fact that undefined behavior (“UB”, cppreference, wikipedia) provides “as-if” without a proof is precisely the risk of using UB for any semantics, and why we don’t use it here and elsewhere we use erroneous behavior.
`bound` parameters
It is erroneous behavior to return something that references a local object that won’t live once the function returns, even if it is a parameter marked `bound`. Local objects include local variables, temporary objects, and `var` parameters.
fn Invalid1() -> i32* {
var x: i32 = 4;
// ❌ Invalid.
return &x;
}
fn Invalid2(bound x: i32) -> ref i32 {
var y: i32 = x;
// ❌ Invalid.
return y;
}
// ✅ Valid
fn Valid1(bound p: i32*) -> ref i32 {
return *p;
}
fn Invalid3(bound var x: i32) -> ref i32 {
// ❌ Invalid: lifetime of `var` parameter
// ends when function returns.
return x;
}
class ReturnMember {
// ✅ Valid
fn ValidRef[bound ref self: Self]() -> ref i32 {
return self.m;
}
// ❌ Invalid: can't return reference to value.
fn InvalidVal[bound self: Self]() -> ref i32 {
return self.m;
}
// ❌ Invalid: `var self` lifetime ends.
fn InvalidVar[bound var self: Self]()
-> ref i32 { return self.m; }
var m: i32;
}
class DerefPointerMember {
// ✅ Valid
fn ValidRef[bound ref self: Self]() -> ref i32 {
return *self.pm;
}
// ✅ Valid
fn ValidVal[bound self: Self]() -> ref i32 {
return *self.pm;
}
// ✅ Valid
fn ValidVar[bound var self: Self]()
-> ref i32 { return *self.pm; }
var pm: i32*;
}
Otherwise, `bound` parameters and global variables are the sources of storage that can be referenced by a return, but need not be referenced, particularly not on every code path.
// Result references `r` if `b` is true, and `p`
// otherwise. Valid as long as both are marked `bound`.
fn Conditional(b: bool, bound ref r: C, bound p: C*)
-> ref C {
if (b) {
return r;
} else {
return *p;
}
}
The parameters of functions defined in an interface may also be marked as `bound`. The `impl` of that interface for a type can omit occurrences of `bound` from the interface, but cannot add new ones.
interface I {
fn F[bound ref self: Self]
(a: Self, bound b: Self, bound c: Self*)
-> ref Self;
}
impl C1 as I {
// ✅ Valid: matches interface
fn F[bound ref self: Self]
(a: Self, bound b: Self, bound c: Self*)
-> ref Self;
}
impl C2 as I {
// ✅ Valid: proper subset of `bound` params
fn F[ref self: Self]
(a: Self, bound b: Self, c: Self*)
-> ref Self;
}
impl C3 as I {
// ❌ Invalid: `a` is not bound in `I.F`.
fn F[ref self: Self]
(bound a: Self, b: Self, c: Self*)
-> ref Self;
}
Like `[[clang::lifetimebound]]` in C++, `bound` does not affect semantics or calling conventions, just what code is legal. This helps avoid mismatches between typechecking against the signatures in an interface when the `impl` functions are different. Exception: the question of whether `bound` affects the lifetime of temporaries is future work.
Note that all combinations of a `val`/`ref`/default return can be bound to a value/`ref`/`var` parameter. Examples:
fn RefToVal(bound ref x: C) -> val D { return x.d; }
fn ValToRef(bound y: C) -> ref D { return *y.ptr; }
fn VarToRef(bound var p: i32*) -> ref i32 { return *p; }
fn VarToDefault(bound var p: i32*) -> i32* { return p; }
For full safety, we need each bound parameter to be immutable for the duration of the lifetime of the returned result. However, the objective for now is only matching `[[clang::lifetimebound]]`, which has the goal of preventing some classes of bugs, not full memory safety. We will reconsider this with the memory safety design.
Clang’s `lifetimebound` attribute also only applies to the immediately pointed to objects (by pointers or reference parameters, or pointers or reference subobjects of an aggregate parameter). We suggest a simpler, transitive model here that is more restrictive but should be compatible. That said, pinning down the exact and firm semantics of `bound`, especially in these complex cases, is deferred to the full memory safety design as well.
How addresses interact with `ref`
The address of a `ref` binding is `noalias` and either `captures(none)` or `captures(ret: address, provenance)`, depending on whether the binding is marked `bound`.
- `noalias` means like C `restrict`; you can’t observe mutations through aliases; mutation through a restricted pointer is not observable through another pointer
- `captures(none)` means there is no transitive escape: you can pass a nocapture pointer to another nocapture function, but you can’t store to memory or return
- `captures(ret: address, provenance)` is like `captures(none)` but may be referenced by a return.
The combination of `noalias` and `captures(none)` semantics is the minimum for the “move-in-move-out” optimization. But this condition is hard to check, so safe code will use stricter criteria. Unsafe code will be required to adhere to just the `noalias` restrictions, but will not be checked (except possibly by a sanitizer at runtime). The details here will be tackled as part of the memory safety design.
Optimizations will only be performed based on information that is enforced or checked by the compiler, so these attributes won’t be passed to LLVM unless their requirements can be established. This avoids introducing undefined behavior, which we particularly don’t want to do in situations where C++ doesn’t.
The goal of these rules is to nudge us towards function boundaries that don’t constructively create aliasing in their API boundary and don’t capture pointers unnecessarily.
These restrictions are experimental, and we should keep track of everything we end up needing to do to work around these restrictions so any reconsideration can be properly informed.
Improved interop and migration with C++ references
We expect this to improve interop and migration by allowing significantly more interface similarity between Carbon and C++. Previously, many things in C++ that used references on interface boundaries would be forced to switch to pointers. This adds ergonomic friction both at a basic level because of the forced change but also a deeper level because it will make it significantly harder to see the parallel usage across the boundary between C++ and Carbon. With reference bindings, the vast majority of this dissonance will be removed.
This does create a migration concern, raised in open discussion on 2025-05-01, that the `nocapture` and `noalias` modifiers don’t match C++ restrictions, particularly on the `this` parameter that we are going to require to migrate to `ref self`. We may have to add back in `addr` to allow a different pointer type for those cases.
Future work: Addressing how we model the various kinds of C++ references that Carbon code may need to interact with is something we are actively considering and will be tackled in a future proposal.
Part of the expression type system, not object types
Much like value/`val` and `var` bindings, `ref` binding and the new return forms are part of the type system, but only through expression categories, patterns (function parameters and so on), and returns. Specifically, we don’t expect them to be part of the object types in Carbon. Like value bindings, we retain a great deal of implementation flexibility around layout, and the specifics of how they are lowered.
This specifically means we will need to incorporate `ref` bindings into the `Call` interface and we will be adding complexity there that will need to be handled by overloading. The changes to the `Call` interface are future work, and overloading, once we add support, will need to carry additional complexity to handle `ref`.
Interaction with `returned var`
The rule is: `returned var` may only be used when there is a single atomic return form, and it is the default `var` category.
// ✅ Allowed
fn F(...) -> V {
returned var v: V = ...;
// ...
return var;
}
fn F(...) -> {.a: var T} {
// ❌ Invalid: composite form
returned var ret: T = ...;
// ...
}
fn F(...) -> val T {
// ❌ Invalid: value return
returned var ret: T = ...;
// ...
}
We can revisit and expand this later if this does not handle use cases we would like to support.
Use case: `Deref` interface
To support customization of the prefix-`*` dereferencing operator, we introduce the `Deref` interface.
interface Deref {
let Result:! type;
fn Op[bound ref self: Self]() -> ref Result;
}
final impl forall [T:! type] T* as Deref {
where Result = T;
fn Op[bound self: Self]() -> ref T
= "builtin.deref";
}
Then `*p` is rewritten to `p.(Deref.Op)()`, and `p->m` is rewritten to `p.(Deref.Op)().m`. For example, this might be used by a smart pointer:
class SmartPtr(T:! type) {
fn Make(p: T*) -> Self { return {.ptr = p}; }
impl as Deref {
where Result = T;
fn Op[bound ref self: Self]() -> ref Result {
return *self.ptr;
}
}
private var ptr: T*;
}
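For comparison, C++ customizes dereferencing in a similar way, with `operator*` returning a reference to the pointee. The sketch below is an analogy only; `SmartPtr`, `Point`, and `Demo` are hypothetical names, and the C++ version needs a separate `operator->` where Carbon rewrites `p->m` in terms of `Deref.Op`.

```cpp
#include <cassert>

// Hypothetical non-owning wrapper, mirroring the `SmartPtr` sketch above.
template <typename T>
class SmartPtr {
 public:
  explicit SmartPtr(T* p) : ptr_(p) {}

  // Plays the role of `Deref.Op`: yields a reference to the pointee.
  T& operator*() const { return *ptr_; }

  // Lets `p->m` reach the same object as `(*p).m`.
  T* operator->() const { return ptr_; }

 private:
  T* ptr_;
};

struct Point {
  int x;
  int y;
};

int Demo() {
  Point pt{1, 2};
  SmartPtr<Point> p(&pt);
  (*p).x += 10;  // Mutates `pt` through the reference from `operator*`.
  p->y += 20;    // Mutates `pt` through `operator->`.
  assert(&*p == &pt);
  return pt.x + pt.y;
}
```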
Use case: indexing interfaces
Proposal #2274: “Subscript syntax and semantics” added the interfaces used to support indexing with the subscripting operator `[`…`]`. We change these in the following ways:
- The `addr self` parameters are changed to `bound ref self`, to allow the result to reference the `self` object.
- The `At` method returns by `val`.
- The `Addr` methods are renamed `Ref` and return a reference instead of a pointer that is automatically dereferenced.
This proposal’s PR makes those changes to the indexing design.
Use case: member binding interfaces
The member binding interface used for reference expressions from proposal #3720 can now be changed to use references instead of pointers.
Before:
// For a reference expression `x` with type `T`
// and an expression `y` of type `U`, `x.(y)` is
// `*y.((U as BindToRef(T)).Op)(&x)`
interface BindToRef(T:! type) {
extend impl as Bind(T);
fn Op[self: Self](p: T*) -> Result*;
}
After:
// For a reference expression `x` with type `T`
// and an expression `y` of type `U`, `x.(y)` is
// `y.((U as BindToRef(T)).Op)(x)`
interface BindToRef(T:! type) {
extend impl as Bind(T);
fn Op[self: Self](bound ref p: T) -> ref Result;
}
Similarly, the `BindToValue` interface is changed to use a `val`/value return, potentially avoiding a copy of large objects.
Before:
interface BindToValue(T:! type) {
extend Bind(T);
fn Op[self: Self](x: T) -> Result;
}
After:
interface BindToValue(T:! type) {
extend Bind(T);
fn Op[self: Self](bound x: T) -> val Result;
}
Use case: class accessors
A `ref` return can be used to expose the state of an object in a way that can be mutated:
class Four {
fn Get[self: Self](i: i32) -> i32 {
Assert(i >= 0 and i < 4);
return self.m[i];
}
fn GetMut[bound ref self: Self](i: i32) -> ref i32 {
Assert(i >= 0 and i < 4);
return self.m[i];
}
private var m: array(i32, 4);
}
var x: Four = {.m = (0, 2, 4, 6)};
x.GetMut(2) += 1;
fn Check(y: Four) {
Assert(y.Get(2) == 5);
}
Check(x);
Future work: this will in the future often be done with an overloaded method, as in:
class Four {
overload Access {
fn [bound ref self: Self](i: i32) -> ref i32 {
Assert(i >= 0 and i < 4);
return self.m[i];
}
fn [self: Self](i: i32) -> i32 {
Assert(i >= 0 and i < 4);
return self.m[i];
}
}
private var m: array(i32, 4);
}
var x: Four = {.m = (0, 2, 4, 6)};
x.Access(2) += 1;
fn Check(y: Four) {
Assert(y.Access(2) == 5);
}
Check(x);
This may be a common enough use case that we will want to introduce a dedicated syntax:
class HasMember {
fn Access[bound ref? self: Self](i: i32) -> ref? i32 {
Assert(i >= 0 and i < 4);
return self.m[i];
}
private var m: array(i32, 4);
}
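The overloaded accessor described in this section has a familiar C++ counterpart in `const` and non-`const` member function overloads. The following sketch is an analogy under assumptions, with hypothetical names (`Four`, `Access`, `Check`, `Demo`) chosen to mirror the Carbon examples above; it does not describe how Carbon overloading will work.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

class Four {
 public:
  explicit Four(const std::array<int, 4>& m) : m_(m) {}

  // Chosen for mutable objects: exposes the element by reference.
  int& Access(std::size_t i) {
    assert(i < 4);
    return m_[i];
  }

  // Chosen for const objects: returns a copy of the element.
  int Access(std::size_t i) const {
    assert(i < 4);
    return m_[i];
  }

 private:
  std::array<int, 4> m_;
};

// Reads through the const overload.
int Check(const Four& y) { return y.Access(2); }

int Demo() {
  std::array<int, 4> init = {0, 2, 4, 6};
  Four x(init);
  x.Access(2) += 1;  // Non-const overload: mutates element 2 (4 becomes 5).
  return Check(x);
}
```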
Type completeness
Not a change by this proposal, but note that our existing rules will require the type in a `ref` binding to be complete in situations where it would not need to be if you were using a value binding with a pointer type instead. We may need to change this in the future to match C++ which treats reference types more like pointer types for completeness purposes.
After this change, a `ref` binding to type `T` will require `T` to be complete in the same situations that other bindings to type `T` require `T` to be complete.
Pointer value representation
Purely as a change in syntax, the way to specify that the value representation of a class uses a pointer is changed from writing `const Self *` to `const ref`.
Future work
Temporary lifetimes
For safety, we need bindings and returns that reference storage to only be used while that storage remains valid. When the referenced storage is owned by a temporary, we have a choice to either control the lifetime of the temporary or diagnose when the lifetime of the temporary is insufficient. Deciding on our policy is future work.
Note that in many cases we can explicitly provide storage in a variable instead of referencing a temporary. For example, using var x: ... = ReturnsATemporary();
instead of let ref x: ... = ReturnsATemporary();
. This won’t apply in all situations, though, such as temporaries that are reachable transitively through pointers.
ref bindings in lambdas
We have already identified future work to support reference captures in lambdas as part of proposal #3848. This might be a reason to support ref bindings as fields of objects, with all the restrictions that come with that.
Interaction with effects
We still need to determine how references and the other return types interact with effects, like Optional, errors, coroutines, and so on. For example, we don’t want to give up the benefits of being able to directly return a reference when a function has an error path.
It is unclear if this will mean putting references into the object type system, but we may be able to handle this with additional types or the ability to customize return representations. For example, we might have an alternate version of the Optional type that holds a reference:
class OptionalRef(T:! type) {
fn Make(bound ref r: T) -> Self {
return {.p = &r};
}
fn MakeEmpty() -> Self {
return {.p = Optional(T*).MakeEmpty()};
}
fn HasValue[self: Self]() -> bool {
return self.p.HasValue();
}
fn Get[bound ref self: Self]() -> ref T {
Assert(self.HasValue());
return *self.p.Get();
}
private var p: Optional(T*);
}
More precise lifetimes
More precise lifetime tracking will be considered with the memory safety design. For example, the bound approach does not distinguish different components of a compound return, or different parts of a parameter object that might have different lifetimes.
Combining with compile-time
We plan to support references to compile-time state when executing a function at compile time. That will be part of a future proposal.
Interaction with Call or other interfaces
For now, ref is not represented in the Call interface introduced in proposal #2875: Functions, function types, and function calls. This will be tackled together in a future proposal with other aspects of bindings not represented by the type, such as var and compile-time, along with being generic across these aspects of bindings.
Destructuring assignment
Having more support for multiple returns from a function opens the question of how to do different things with the different returns. We may want a syntax for saying some of the returns are bound to new names, and some are used in assignments to existing variables. One possibility would be to have some pattern syntax for re-initializing an existing object, as in:
fn F() -> (-> bool, ->T);
fn G() {
var x: T = ...;
Consume(~x);
let (b: bool, init x) = F();
// Continue to use `x`...
}
This was discussed in the #syntax channel on Discord on 2025-05-19.
Restore addr
There are two reasons we might restore addr as an alternative to ref for the implicit self parameter:
- As a way to express uses of self that we want to disallow for ref bindings generally but need an escape hatch for migrated C++ code that relies on these patterns.
- To allow the self parameter to use one of the pointer semantics we create as part of the upcoming memory safety design, that can’t be achieved with ref.
Rationale
This proposal tries to advance these Carbon goals:
- Performance-critical software
  - Having a “move-in-move-out” option as a calling convention is a potential performance improvement for using ref parameters instead of pointers.
  - Giving additional options for the return convention gives opportunities for improved performance. Having this set by explicit return markings is about giving control and predictability to the code author.
- Code that is easy to read, understand, and write
  - ref bindings and returns avoid the ceremony of round-tripping through pointers.
- Practical safety and testing mechanisms
  - Checking that reference bindings are not dangling is important for avoiding use-after-free bugs.
  - bound markings on parameters allow safety equivalent to Clang’s [[clang::lifetimebound]] when returning a reference.
- Interoperability with and migration from existing C++ code
  - Including references in Carbon allows for less mismatch for C++ code using references.
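As a point of reference for the bound comparison above, here is a minimal C++ sketch of Clang’s [[clang::lifetimebound]] attribute on a parameter whose storage the returned reference points into. The macro LIFETIME_BOUND and the function FirstChar are hypothetical names chosen for illustration; the macro expands to nothing on other compilers so the code stays portable.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper macro: only Clang understands the attribute.
#if defined(__clang__)
#define LIFETIME_BOUND [[clang::lifetimebound]]
#else
#define LIFETIME_BOUND
#endif

// Returns a reference into `s`. The attribute tells Clang that the result
// must not be used after the argument bound to `s` is destroyed.
inline const char& FirstChar(const std::string& s LIFETIME_BOUND) {
  return s[0];
}

// With Clang, binding the result to a temporary argument, as in
//   const char& c = FirstChar(std::string("tmp"));
// is expected to produce a dangling-reference warning, because the
// temporary is destroyed at the end of the full expression.
```

As the proposal notes elsewhere, the attribute only affects which code is accepted or diagnosed; it does not change the calling convention.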
Alternatives considered
These ideas were discussed in open discussion on:
- 2025-05-01
- 2025-05-06
- 2025-05-07 a b
- 2025-05-08
- 2025-05-12
- 2025-05-13
- 2025-05-14
They were also discussed in the #pointers-and-references channel in Discord starting 2025-05-05, and #syntax on 2025-05-14.
No ref, only pointers
The rationale to add ref instead of staying with pointers was discussed on 2025-05-01. In addition to the motivating problems given in the “Problem” section, that discussion included some additional depth to the reasons to add reference bindings.
There is a tension between wanting to have mutating expressions and only having pointers. You need some concept like a reference in order to mutate an object with an expression. The question is how small a box the references are restricted to, and where the line is drawn. C has lvalues, which contain references but are restricted to a quite small box. Reference bindings specifically are about keeping a small box around references while still adding enough expressivity to support our use cases. We started with a model similar to C, but it fell down when it came to composition. Decomposing an expression into pieces loses the tools the expression provided to you. The missing tool for that was reference bindings.
We saw how much we were leaning on value bindings. The asymmetry of having value bindings but not reference bindings, when we have both value expressions and reference expressions, was creating pressure. For example, when accessing members of an object, we had to escape to pointers in that operator.
One downside of this change is that, before it, indirections were more visible in the code.
Also, this does fundamentally mean that we now have another kind of “pointer”, potentially adding complexity to any memory-safety story. However, this ship already sailed to some extent with value bindings. Fundamentally, bindings are allowed to have pointer-like semantics from a lifetime perspective, and so will need to be considered as a pointer-like thing as we build out lifetime safety.
Remove pointers after adding references
If we removed pointers after adding references, we would need something rebindable for assignable objects. The viable path forward without separate pointers and references is to have something rebindable like pointers but automatically dereferenced like references, which is the approach Rust takes. See this comment on issue #5261.
One of the features of a reference is what it cannot do, and we would have to remove those restrictions to be able to satisfy the pointer use cases with references.
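The distinction between rebinding and assigning through can be illustrated with C++ pointers and references. This is only a sketch of the general idea, not of Carbon semantics, and the function names are hypothetical.

```cpp
#include <cassert>

// Assigning to a reference writes through to the object it is bound to.
// Returns the final value of `a`.
inline int AssignThroughReference(int a, int b) {
  int& r = a;
  r = b;  // Copies b's value into a; r still refers to a.
  return a;
}

// Assigning to a pointer rebinds it; the original pointee is untouched.
// Returns the final value of `a`.
inline int RebindPointer(int a, int b) {
  int* p = &a;
  p = &b;    // p now points at b.
  *p += 10;  // Modifies b, not a.
  return a;
}
```

A single construct that covered both use cases would need some way to express both operations, which is the restriction-removal this section describes.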
Allow ref bindings in the fields of classes
A type with reference binding fields would need a lot of restrictions since reference bindings are not assignable. We did not see enough motivation to put references into objects, given the complexity that it would introduce, so we are keeping references out of types for now. This could change to support lambda reference captures.
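C++ gives a concrete picture of the kind of restriction involved: a class with a reference member loses its implicitly generated copy assignment, while the equivalent class with a pointer member keeps it. The types WithRef and WithPtr below are hypothetical and only illustrate the C++ rule.

```cpp
#include <cassert>
#include <type_traits>

struct WithRef {
  int& r;  // Cannot be rebound, so the class cannot be assigned.
};

struct WithPtr {
  int* p;  // Rebindable, so memberwise assignment works.
};

static_assert(!std::is_copy_assignable<WithRef>::value,
              "a reference member deletes the implicit copy assignment");
static_assert(std::is_copy_assignable<WithPtr>::value,
              "a pointer member keeps the implicit copy assignment");
```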
No call-site annotation
This question was discussed on 2025-05-07.
We decided that the marking is not about lifetime, but ability to mutate. A val may reference an object in a similar way to a ref, restricting operations on the original object, but we are not going to mark vals since those restrictions are enforced by the compiler. We thought the ability to mutate, though, was something important enough to highlight to readers of the code, even at the expense of extra work for the writer.
Swift inout parameters are marked at the call site with an & before the argument. Jordan Rose has published a regret that they didn’t use inout to mark the argument instead of &.
On the other hand, not marking is not known to be a source of bugs.
This is a “try it and see how well it works” sort of decision.
Top-level ref introducer
For now, we don’t believe let ref to be so common as to need a shorter spelling, unlike what we do for var. This was considered in leads issue #5523, which provided this rationale:
I feel like this would often be used for non-local mutation due to it fundamentally departing from mutable value semantics and instead having reference semantics. Unlike the local mutation, that seems more worthwhile to have incentives around minimizing.
However, this seems the easiest of all to revisit later if we discover that the added verbosity in practice is costing more than is worth any improvements from explicitly flagging mutable reference semantics, or if we find code is reaching for antipatterns due to the incentive.
In addition, this comment on leads issue #5522 argued that it would be more consistent for ref and val to only apply to bindings, and not introduce patterns as let and var do.
Allow immutable value semantic bindings nested within variable patterns
This was considered in leads issue #5523, which provided this rationale:
While it may be an obvious point of orthogonality, I think it adds choice without sufficient motivation, and even having that choice does add some complexity to the language.
It also seems like we could add this later if there is sufficient demand when we have larger usage experience body to pull from with the rest of the Carbon language. Currently, the affordance that feels more natural to me is what we have.
I think we’re happy to see motivating use cases and revisit this. At the moment, we’ve just not seen motivating use cases – everything has seemed a bit too contrived.
Remove var as a top-level statement introducer
This was considered in leads issue #5523, which provided this rationale:
Locals are important, frequent, and frequently mutable. I don’t think forcing varying locals to go through let var for orthogonality aids readability enough to offset the verbosity cost of added keywords on a reasonably common pattern.
I’m still currently in the position that locally owned objects being mutable should not be “discouraged” or “disincentivized” by the language. And I think adding artificial incentives to try and avoid needing a mutable local variable would either have no effect beyond verbosity, or if it did have effect, it wouldn’t be a net positive effect due to code being written in a less straightforward manner in order to avoid mutation.
To be clear, this is based on intuition and judgement based on my experience, not in any way based on data or specific motivating examples. I can imagine data or evidence or even a new perspective changing my position here, but so far the discussion we’ve had haven’t done that.
ref as a type qualifier
The big concern is that any effect that is represented by a type, like Optional or Result, will want to compose with reference returns. This could be done by allowing ref to create an object type that could be used as a parameter to those, as in Optional(ref T), but we are trying to avoid going down that path. We have future work to tackle this problem specifically.
There was also a concern that we might need ref types to represent argument lists with tuples, but tuples already can’t represent var or compile-time parameters. We have other plans for this, instead of trying to stretch tuples to encompass these use cases.
We also noted that including references in the type system led to a number of inconsistencies in C++, such as there not being references to references.
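One such inconsistency can be seen directly in C++: applying & to a reference type through an alias does not produce a reference to a reference but collapses back to the same reference type, whereas pointers compose normally. The alias names below are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

using IntRef = int&;
using IntPtr = int*;

// Reference collapsing: there is no "reference to reference" type.
static_assert(std::is_same<IntRef&, int&>::value,
              "& applied to int& collapses to int&");
static_assert(std::is_same<IntRef&&, int&>::value,
              "&& applied to int& also collapses to int&");

// Pointers, by contrast, nest as ordinary object types.
static_assert(std::is_same<IntPtr*, int**>::value,
              "* applied to int* yields int**");
```

Generic code that wraps a type parameter therefore behaves differently for reference types than for every other type, which is part of what keeping ref out of the type system avoids.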
bound would change the default return to val
We considered saying that bound would change the default return to use the ->val return convention. This was discussed on 2025-05-01 and 2025-05-08. The idea is that val is expected to be efficient, so we should encourage using it. We can’t always use val, since some types have a reference value representation, but bound alleviates that concern.
Once we realized that bound is relevant for all return conventions, we reconsidered that approach, since it has a number of concerns:
- Changing defaults is action at a distance, changing the behavior without changing the code in the relevant location.
- We don’t want to have to change the return category when copy-pasting a function from an interface to an impl of it and removing bound.
- Lifetimes in Rust and Clang’s [[clang::lifetimebound]] don’t change calling conventions, only what code is valid.
Going with an approach where less depends on bound makes sense for now, since we are going to reconsider these issues as part of our upcoming memory safety work.
Other return conventions
We also considered other conventions for returning from functions, in a comment on #5434, on 2025-05-08 and on 2025-05-12, most notably:
- in place: This convention was like ->, but always using the “in place” convention where the caller allocates storage and provides the callee with its address, the callee initializes the storage at that address, and the caller is responsible for destroying after the return.
- var without storage: The callee returns a pointer to the storage of a subobject of a bound var parameter, which the caller is then responsible for destroying. A call to this function is a reference expression, but with additional responsibility to destroy.
- hybrid: If the type has a copy value representation or trivial destructive move then return the object representation directly; otherwise the caller passes a pointer and the callee initializes it.
There were also some variations on what the conditions would be for returning in registers using the default return convention.
We seriously considered “var without storage”, but the fact that it couldn’t reliably be used to initialize a variable, particularly in the middle of an object, meant it did not seem valuable enough to include.
It seemed more valuable to support the “in place” return convention. That return form allows you to guarantee knowing the address of the object being constructed, and was a good match for returned var. However, we realized that var declarations shouldn’t always be associated with in-memory storage, in particular for types that may be trivially moved. For example, a var parameter with the C++ type std::unique_ptr<T> should be passed in registers. A function returning a std::unique_ptr<T> in place would not be as efficient as returning it by moving it into registers.
return var with compound return forms
We considered various syntax options on 2025-05-12, but none of them seemed good enough to justify inclusion at this time:
fn F(...) -> (ref R, val L, V) {
// No longer a `var` being returned. Ideally these
// shouldn't have to be initialized together.
returned ??? (ref r: R, val l: L, var v: V) = ...?
return var;
}
// We could restrict to one `var` return component,
// but this is a lot of machinery for a small increase
// in expressiveness and applicability.
fn F(...) -> (ref R, val L, V) {
returned var v: V = ...;
let l: L = ...;
return (*r, l, var);
}
fn F(...) -> (ref R, val L, V) {
// These don't have the right category, and ideally
// shouldn't have to be initialized together.
returned var ret: (R, L, V) = ...?
return var;
}
fn F(...) -> (ref R, val L, V) {
returned var (_, _, var v: V) = <what goes here?>;
}
There was another approach we considered for returned var originally:
fn F(...) -> (ref R, val L, v1: V1, var v2: V2) {
// ...
// Must use the same names for the `var` (implicit or explicit) returns
// with bound names.
return (r, l, v1, v2);
}
But this had downsides that still apply:
- Requires V1 and V2 to have unformed states. Otherwise, v1 and v2 would need to be initialized when they are declared.
- This does not support only having some branches use return var.
Our current approach handles our main use case for returned var: factory functions.
We could support an “only vars” approach in the future if we want:
fn F(...) -> (var V1, var V2, var V3) {
returned (var v1: V1, var v2: V2, var v3: V3) = ...;
// ...
return var;
}
Other syntax for compound return forms
We considered other options for the syntax of compound return forms on 2025-05-13, in #syntax in Discord on 2025-05-14, and on 2025-05-14. The option of omitting the -> in each component did not distinguish tuples from tuple return forms sufficiently:
-> (ref i32, var i32)
-> (bool, ref i32)
// Is this a single return of a tuple, or a triple
// return using the default return convention?
-> (i32, i32, i32)
We also considered an approach where compound return forms would start with ->?, but this raised concerns about what the meaning of that syntax would be and whether we want to expose users to that in cases we might be able to avoid it.
The original proposed syntax used an arrow -> in each component of a compound form.
fn TupleReturn(...)
-> (->val bool, ->ref i32, -> C);
fn StructReturn(...)
-> {->val .a: bool,
->ref .b: i32,
-> .c: C};
This avoided ambiguity, but was verbose and visually noisy. An alternative was suggested in discussion on 2025-06-13. This alternative used a default of compound returns for paren (…) and brace {…} expressions, which you could opt out of by using one of the three category keywords such as var to introduce a type expression that would not be considered a compound return.
This had the downside that -> T would be interpreted as -> var T even if T was a tuple type like (i32, i64). However, textually substituting (i32, i64) for T to get -> (i32, i64) would instead be interpreted as -> (var i32, var i64).
To overcome this problem, in discussion on 2025-06-16, we switched to the approach from this proposal.
ref parameters allow aliasing
If requiring ref parameters to be noalias ends up being too restrictive, we could instead have the “move-in-move-out” optimization be done only when the compiler can prove it safe. One strategy would be to generate an alternate version of the function that is only used in the cases where the noalias conditions can be shown to hold statically.
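To see why the “move-in-move-out” convention depends on the absence of aliasing, consider the following C++ sketch. Both functions are hypothetical and written by hand; the second one imitates what such a calling convention might do by loading the referenced value once, working on a local copy, and storing it back on return.

```cpp
#include <cassert>

// Reference semantics: every access goes through the parameters, so a
// write to `a` is visible through `b` if they refer to the same object.
inline void AddTwiceDirect(int& a, const int& b) {
  a += b;
  a += b;
}

// Hand-written imitation of "move-in-move-out": `a` is loaded once,
// updated in a local, and written back only at the end.
inline void AddTwiceMoveInMoveOut(int& a, const int& b) {
  int local = a;
  local += b;
  local += b;
  a = local;
}
```

When the two arguments are distinct objects holding 1, both versions leave 3 in the first. When the same object holding 1 is passed for both, the direct version produces 4 and the move-in-move-out version produces 3. The two are interchangeable only when the arguments do not alias, which is the condition an alternate version of the function would rely on.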
let to mark value returns instead of val
This proposal initially used let instead of val to mark immutable value returns. However, let is used, in Carbon and other languages, primarily to bind names. In Carbon, the default binding is a value binding, but that was not considered a close enough connection between the let keyword and value semantics.
There was also a concern about reusing let in multiple contexts to mean different things, and having separate keywords that were only used to mark the category of the binding was deemed better separation of concerns.
We considered making a parallel change to use init instead of var, but this had some problems:
- By making initializing returns the default, there is little expected usage, so perhaps not worth spending another keyword on.
- The init keyword would be particularly expensive, because C++ code commonly uses that word in APIs.
This question was considered in leads issue #5522.
=> infers form, not just type
There was support for the idea that the => return syntax introduced in proposal #3848 should deduce the form of the return, not just its type. This was discussed in #lambdas on 2025-05-20.
However, trying to infer the expression category from the category of the expression after the => runs into the problem of this often requiring the parameters to be marked bound. A more complicated rule, for example using whether any parameter is marked bound, could be adopted in the future, if the simple rule proves inadequate.
Alternatively, the compiler could infer which parameters should be marked bound in this case. That is something to consider with the memory safety design.