Carbon <-> C++ Interop: Primitive Types
Table of contents
Abstract
Define the type mapping of the primitive types between Carbon and C++.
Problem
Interoperability of Carbon with C++ is one of the Carbon language goals (see Interoperability with and migration from existing C++ code). Providing unsurprising mappings between C++ and Carbon types is one of it’s sub goals.
This proposal addresses the type mapping between the two languages to support achieving this goal.
Background
Data models
The following data models are widely accepted:
- 32-bit systems:
LP32
(Win16 API):int
16-bit;long
32-bit;pointer
32-bit.ILP32
(Win32 API; Unix and Unix-like systems):int
32-bit;long
32-bit;pointer
32-bit.
- 64-bit systems:
LLP64
(Win32 API: 64-bit ARM or x86-64):int
32-bit;long
32-bit;pointer
64-bit.LP64
(Unix and Unix-like systems (Linux, macOS)):int
32-bit;long
64-bit;pointer
64-bit.
Carbon will prioritize supporting modern OS, 64-bit little endian platforms (for example LLP64, LP64). Historic platforms like LP32 won’t be supported.
For clarity, the text below omits LP32 relevant information and focuses only on the Carbon supported platforms.
Carbon Primitive Types
Carbon has the following primitive types:
bool
: boolean type takingtrue
orfalse
- integer types:
- signed integer types:
iN
(N
- bit width, a positive multiple of 8)i8
,i16
,i32
,i64
,i128
,i256
- unsigned integer types:
uN
(N
- bit width, a positive multiple of 8)u8
,u16
,u32
,u64
,u128
,u256
- signed integer types:
- floating-point types:
fN
(N
- bit width, a positive multiple of 8), IEEE-754 formatf16
,f32
, andf64
- always availablef80
,f128
, orf256
may be available, depending on the platform
C++ Fundamental Types
C++ calls the primitive types fundamental types. The following fundamental types exist in C++:
void
std::nullptr_t
std::byte
- integral types (also integer types):
bool
- character types:
- narrow character types:
signed char
,unsigned char
,char
,char8_t
(c++20) - wide character types:
char16_t
,char32_t
,wchar_t
- narrow character types:
- signed integer types:
- standard signed integer types:
signed char
,short
,int
,long
,long long
- extended signed integer types (implementation-defined)
- standard signed integer types:
- unsigned integer types:
- standard unsigned integer types:
unsigned char
,unsigned short
,unsigned int
,unsigned long
,unsigned long long
- extended unsigned integer types
- standard unsigned integer types:
- floating-point types:
- standard floating-point types:
float
,double
,long double
- extended floating-point types:
- fixed width floating-point types (since C++23):
float16_t
,float32_t
,float64_t
,float128_t
,bfloat16_t
- other implementation-defined extended floating-point types
- fixed width floating-point types (since C++23):
- standard floating-point types:
void
Objects of type void
are not allowed, neither are arrays of void
, nor references to void
. Pointers to void
and functions returning void
are allowed.
std::nullptr_t
The type of nullptr
(the null pointer literal). It’s a distinct type that is not itself a pointer type.
std::byte
Type | Width in bits | Notes |
---|---|---|
std::byte | 8-bit | can be used to access raw memory, same as unsigned char , but it’s not a character type and is not an arithmetic type |
Character types
Type | Width in bits | Notes |
---|---|---|
char | 8-bit | multibyte characters; same representation, alignment and signedness as either signed char or unsigned char (platform-dependent), but it’s a distinct type |
signed char | 8-bit | signed character representation |
unsigned char | 8-bit | unsigned character representation; raw memory access |
char8_t | 8-bit | UTF-8 character representation; same size, alignment and signedness as unsigned char , but a distinct type |
char16_t | 16-bit | UTF-16 character representation; same size, alignment and signedness as std::uint_least16_t , but a distinct type |
char32_t | 32-bit | UTF-32 character representation; same size, alignment and signedness as std::uint_least32_t , but a distinct type |
wchar_t | 32-bit on Linux, 16-bit on Windows | wide character representation, holds UTF-32 on Linux and other non-Windows platforms, UTF-16 on Windows. |
Signed integer types
Standard signed integer types
Type | Width in bits |
---|---|
signed char | 8-bit |
short | 16-bit |
int | 32-bit |
long | LLP64: 32-bit; LP64: 64-bit |
long long | 64-bit |
Exact-width integer types
Typically aliases of the standard integer types.
Type | Width in bits | Defined as |
---|---|---|
std::int8_t | 8-bit | typedef signed char int8_t |
std::int16_t | 16-bit | typedef signed short int16_t |
std::int32_t | 32-bit | typedef signed int int32_t |
std::int64_t | 64-bit | LLP64: typedef signed long long int64_t |
LP64: typedef signed long int64_t |
Fastest minimum-width integer types
Integer types that are usually fastest to operate with among all integer types that have the minimum specified width.
Type | Width in bits | Defined as |
---|---|---|
std::int_fast8_t | >=8-bit | typedef signed char int_fast8_t |
std::int_fast16_t | >=16-bit | implementation dependent |
std::int_fast32_t | >=32-bit | implementation dependent |
std::int_fast64_t | >=64-bit | LLP64: typedef signed long long int_fast64_t |
LP64: typedef signed long int_fast64_t |
Minimum-width integer types
Smallest signed integer type with width of at least N-bits.
Type | Width in bits | Defined as |
---|---|---|
std::int_least8_t | >=8-bit | typedef signed char int_least8_t |
std::int_least16_t | >=16-bit | typedef short int_least16_t |
std::int_least32_t | >=32-bit | typedef int int_least32_t |
std::int_least64_t | >=64-bit | LLP64: typedef signed long long int_least64_t |
LP64: typedef signed long int_least64_t |
Greatest-width integer types
Maximum-width signed integer type.
Type | Width in bits | Defined as |
---|---|---|
std::intmax_t | >=32-bit | LLP64: typedef signed long long intmax_t |
LP64: typedef signed long intmax_t |
Integer types capable of holding object pointers
Signed integer type, capable of holding any pointer.
Type | Width in bits | Defined as |
---|---|---|
std::intptr_t | >=16-bit | most platforms: typedef long intptr_t |
some ILP32:typedef int intptr_t |
Other signed integer types
Type | Width in bits | Defined as |
---|---|---|
ptrdiff_t | >=16-bit | most platforms: typedef std::intptr_t ptrdiff_t |
Holds the result of subtracting two pointers. |
Unsigned integer types
The unsigned integer types have the same sizes as their signed counterparts.
Type | Width in bits | Defined as |
---|---|---|
size_t | >=16-bit | most platforms: typedef uintptr_t size_t |
Holds the result of the sizeof operator. |
Floating-point types
Standard floating-point types
Type | Format | Width in bits | Note |
---|---|---|---|
float | usually IEEE-754 binary32 | 32-bits | The format or the size can vary depending on the compiler and the platform. |
double | usually IEEE-754 binary64 | 64-bits | The format or the size can vary depending on the compiler and the platform. |
long double | IEEE-754 binary128 | 128-bit | used by some SPARC, MIPS, ARM64 implementations. |
IEEE-754 binary64-extended format | 80-bit or 64-bit | 80-bit (most x86 and x86-64 implementations); 64-bit used by MSVC. | |
double-double | 128-bit | used on PowerPC. |
Fixed-width floating-point types (C++23)
They aren’t aliases to the standard floating-point types (float
, double
, long double
), but to an extended floating-point type.
Type | Width in bits | Defined as |
---|---|---|
std::float16_t | 16-bit | using float16_t = _Float16 |
std::float32_t | 32-bit | using float32_t = _Float32 |
std::float64_t | 64-bit | using float64_t = _Float64 |
std::float128_t | 128-bit | using float128_t = _Float128 |
std::bfloat16_t | 16-bit |
Proposal
- The C++ fixed-width integer types
intN_t
will be the same type as Carbon integer typesiN
. Likewise foruintN_t
<->uN
. - A C++
builtin type
will be available in Carbon asCpp.builtin_type
, for the standard C++ signed/unsigned integer and floating-point types. - A C++ integer
builtin type
that is not the same asintN_t
oruintN_t
for any N, will be nameable in Carbon only asCpp.builtin_type
. - Different C++ types will be considered different in Carbon, so C++ overload resolution can be handled without issues.
Details
The table of Carbon <-> C++ mappings is as follows:
Carbon | C++ |
---|---|
() as a return type | void |
bool | bool |
i8 | int8_t |
i16 | int16_t |
i32 | int32_t |
i64 | int64_t |
i128 | int128_t |
u8 | uint8_t |
u16 | uint16_t |
u32 | uint32_t |
u64 | uint64_t |
u128 | uint128_t |
Cpp.signed_char | signed char |
Cpp.short | short |
Cpp.int | int |
Cpp.long | long |
Cpp.long_long | long long |
Cpp.unsigned_char | unsigned char |
Cpp.unsigned_short | unsigned short |
Cpp.unsigned_int | unsigned int |
Cpp.unsigned_long | unsigned long |
Cpp.unsigned_long_long | unsigned long long |
Cpp.float | float |
Cpp.double | double |
Cpp.long_double | long double |
f16 | std::float16_t (_Float16) |
f128 | std::float128_t (_Float128) |
TBD | float32_t , float64_t , bfloat16_t |
TBD | char , charN_t , wchar_t |
TBD | std::byte |
TBD | std::nullptr_t |
In addition to the exact mappings above, the following are expected to be the same type due to the different spellings of the types in C++ being the same:
Carbon type | C++ type |
---|---|
i8 | signed char |
u8 | unsigned char |
i16 | short |
u16 | unsigned short |
i32 | int |
u32 | unsigned int |
i64 | long or long long |
u64 | unsigned long or unsigned long long |
C++ -> Carbon mapping details
- C++
intN_t
type will be considered the same type as Carbon’siN
type. Likewise foruintN_t
<->uN
. -
C++
builtin type
will be available in Carbon inside theCpp
namespace under the nameCpp.builtin_type
, for the standard signed/unsigned integer and floating-point types.-
The names will follow the pattern:
Cpp.[unsigned_](long_long|long|int|short|double|float)
that is signedness, then size keyword(s), then a type keyword only if there are no size keywords. For example
Cpp.unsigned_int
notCpp.unsigned
,Cpp.long
notCpp.long_int
. -
They will be available when an
import Cpp
declaration is present. -
Name collision: This naming may cause name collisions if such a name already exist in the unnamed C++ namespace. We consider this not to be a common case and would not support such cases, for the benefit of having the C++-specific stuff in the package
Cpp
. -
Cpp.builtin_type
will be the same type asiN
/uN
, if the corresponding C++builtin type
is the same asintN_t
/uintN_t
on that platform. Otherwise it will be available in Carbon as a new, distinct type that is compatible with some of theiN
/uN
types. For example:-
If
int32_t
is the same type asint
, thenCpp.int
will be the same type asi32
. -
If
int64_t
is the same type aslong
, thenCpp.long
will be the same type asi64
.Cpp.long_long
will be a different type, compatible withi64
.
-
-
Cpp.float
andCpp.double
will be the same type asf32
andf64
correspondingly.
-
- The type aliases
[u]int_fastN_t
,[u]int_leastN_t
,[u]intmax_t
,[u]intptr_t
,ptrdiff_t
andsize_t
will be available in Carbon in theCpp
namespace if the C++ header declaring them is imported (for example<stdint.h>
,<cstdint>
etc), with names likeCpp.[u]int_fastN_t
,Cpp.[u]int_leastN_t
,Cpp.size_t
etc. No special support will be provided.
Carbon -> C++ mapping details
- Same as above, Carbon
iN
/uN
types will map to the C++intN_t
/uintN_t
types. f32
/f64
will map tofloat
/double
correspondingly.f16
/f128
will map tostd::float16_t (_Float16)
/std::float128_t (_Float128)
correspondingly.- Some Carbon types may not have direct mappings in C++:
i256
,u256
,f80
,f256
.
Rationale
One of Carbon’s goals is seamless interoperability with C++ (see Interoperability with and migration from existing C++ code), calling for clarity of the calls and high performance.
The proposal maps the Carbon types to their direct equivalents in C++, with zero overhead, supporting the request for unsurprising mappings between C++ and Carbon types with high performance.
Alternatives considered
Naming of new types:
- Allow all keyword permutations.
- Reason not to do this: unnecessary and complicated.
- Only include the keywords, and provide some syntax for combining them (eg,
Cpp.unsigned
&Cpp.long
orCpp.unsigned(Cpp.long)
).- Reason to do this: avoids taking any identifiers from Cpp that are not C++ keywords.
- Reason not to do this: overly complicated.
- Use
Core.Cpp.T
instead ofCpp.T
.- Reason to do this: avoid name collisions with C++ code.
- Reason not to do this: The name collisions should not be a problem in practice, and would prefer to keep C++-specific stuff in package Cpp.
long
Cpp.long
andCpp.long_long
both map to Carbon types that are distinct fromiN
for anyN
, but are compatible with eitheri32
ori64
as appropriate.- Reason to not do this: unnecessary conversions and handling
long
andlong long
differently than the other C++ types.
- Reason to not do this: unnecessary conversions and handling
- Provide platform-dependent conversion functions for
long
.- Reason to do this: the conversions will be clearly outlined.
- Reason not to do this: performance overhead for certain platforms.
- Map
long
always to a fixed-sized Carbon type depending on the platform (for example to eitheri32
ori64
)- Reason to do this: all the code will be using fixed-sized types.
- Reason not to do this: the same C++ function may map differently on different platforms and the Carbon code should compensate for that to make the code compile.
float32_t
, float64_t
- Map
f32
<->float32_t
andf64
<->float64_t
- Reason to do this: follow the same analogy as for the integer types (
iN
<->intN_t
) - Reason not to do this:
float32_t
,float64_t
are new types since C++23, so this won’t be directly achievable, but the corresponding_FloatN
types will need to be used for the older C++ versions.- they are not aliases for the standard floating-point types (
float
,double
,long double
), but for extended floating-point types, so type conversions will be needed for the standard types.
- Reason to do this: follow the same analogy as for the integer types (
Open questions
The mapping of the following types remains open and will be discussed at a later point:
char
,char8_t
,char16_t
,char32_t
,wchar_t
- Carbon still doesn’t have character types, so the mapping of these types will be discussed once they are available.
- These are all distinct types in C++, which should be taken into account to prevent any issues for overloading.
std::byte
std::nullptr_t
void*
Cpp.long_double
- details of this new type is still to be discussed.float32_t
,float64_t
,bfloat16_t
.