C++ interoperability goals
Table of contents
- Problem
- Background
- Proposal
- Philosophy
- Language goal influences
- Performance-critical software
- Software and language evolution
- Code that is easy to read, understand, and write
- Practical safety guarantees and testing mechanisms
- Fast and scalable development
- Modern OS platforms, hardware architectures, and environments
- Interoperability with and migration from existing C++ code
- Goals
- Non-goals
- Open questions to be resolved later
- Rationale
Problem
Carbon’s goals include “Interoperability with and migration from existing C++ code”. This proposal aims to provide additional detail to that goal, beyond what makes sense to have in the main goals document.
Background
Interoperability and migration are key to Carbon. However, performance and evolution are higher priorities. This proposal aims to outline what this interaction of priorities should end up looking like in Carbon, and to indicate a couple of trade-offs that will occur.
Other language interoperability layers that may offer useful examples are:
- Java/Kotlin should be a comparable interoperability story. The languages are different, but share an underlying runtime. This may be closest to the model we desire for Carbon.
- JavaScript/TypeScript is similar to C/C++, where one language is essentially a subset of the other, allowing high interoperability. This is an interesting reference point, but we are looking at a different approach with a clearer boundary.
- C++/Java is an example of requiring specialized code for the bridge layer, making interoperability more of a burden on developers. The burden of the approach may be considered to correspond to the difference in language memory models and other language design choices. Regardless, the result can be considered higher maintenance for developers than we want for Carbon.
- C++/Go is similar to C++/Java. However, Go notably allows C++ bridge code to exist in the .go files, which can ease maintenance of the bridge layer, and is desirable for Carbon.
Proposal
Philosophy
The C++ interoperability layer of Carbon is the section wherein a specific, restricted set of C++ APIs can be expressed in a way that’s callable from Carbon, and similarly for calling Carbon from C++. This requires expressing one language as a subset of the other. Our goal is that the constraint of expressivity is loose enough that the resulting amount of bridge code is sustainable.
The design for interoperability between Carbon and C++ hinges on:
- The ability to interoperate with a wide variety of code, such as classes/structs and templates, not just free functions.
- A willingness to expose the idioms of C++ into Carbon code, and the other way around, when necessary to maximize performance of the interoperability layer.
- The use of wrappers and generic programming, including templates, to minimize or eliminate runtime overhead.
These things come together when looking at how custom data structures in C++ are exposed into Carbon, and the other way around. In both languages, it is reasonable and even common to have customized low-level data structures, such as associative containers. For example, there are numerous data structures for mapping from a key to a value that might be best for a particular use case, including hash tables, linked hash tables, sorted vectors, and btrees. Even for a given data structure, there may be slow but meaningful evolution in implementation strategies.
The result is that it will often be reasonable to directly expose a C++ data structure to Carbon without converting it to a “native” or “idiomatic” Carbon data structure. Although interfaces may differ, a trivial adapter wrapper should be sufficient. Many Carbon data structures should also be able to support multiple implementations with C++ data structures being one such implementation, allowing for idiomatic use of C++ hidden behind Carbon.
The reverse is also true. C++ code will often not care, or can be refactored to not care, what specific data structure is used. Carbon data structures can be exposed as yet another implementation in C++, and wrapped to match C++ idioms and even templates.
For example, a C++ class template like std::vector<T> should be usable without wrapper code or runtime overhead, including when passing a Carbon type as T. The resulting type should be equally usable from either C++ or Carbon code. It should also be easy to wrap std::vector<T> with a Carbon interface for transparent use in idiomatic Carbon code.
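The "trivial adapter wrapper" idea can be sketched on the C++ side alone. The following is only an illustration under stated assumptions: ArrayAdapter and SumAll are hypothetical names, not part of any Carbon or C++ library, and the adapter's vocabulary (Append, Count, At) merely stands in for whatever a differently styled interface might look like.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical adapter: renames std::vector operations to a different
// vocabulary without copying the elements.
template <typename T>
class ArrayAdapter {
 public:
  explicit ArrayAdapter(std::vector<T>& storage) : storage_(&storage) {}

  void Append(const T& value) { storage_->push_back(value); }
  std::size_t Count() const { return storage_->size(); }
  const T& At(std::size_t index) const {
    assert(index < storage_->size());
    return (*storage_)[index];
  }

  // Iteration forwards directly to the underlying container.
  auto begin() const { return storage_->cbegin(); }
  auto end() const { return storage_->cend(); }

 private:
  std::vector<T>* storage_;  // Non-owning: the vector stays usable directly.
};

// Generic code that does not care which container it is given, as long as
// the container can be iterated.
template <typename Container>
long long SumAll(const Container& values) {
  return std::accumulate(values.begin(), values.end(), 0LL);
}
```

Because the adapter only holds a pointer, the same std::vector remains usable through its native interface, and SumAll accepts either form.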
Language goal influences
Performance-critical software
Interoperability with C++ will be frequently used in Carbon, whether it’s C++ developers trying out Carbon, incrementally migrating a large C++ codebase, or continuing to use a C++ library long-term. In all cases, it must be possible to write interoperable code with zero overhead; copies must not be required.
Software and language evolution
Interoperability will require the addition of features to Carbon which exist primarily to support interoperability use cases. However, these features must not unduly impinge on the overall evolution of Carbon. In particular, only a subset of Carbon features will support interoperability with C++. To do otherwise would restrict Carbon’s feature set.
Code that is easy to read, understand, and write
Interoperability-related Carbon code will likely be more difficult to read than other, more idiomatic Carbon code. This is okay: aiming to make Carbon code readable doesn’t mean that it needs to all be trivial to read. At the same time, the extra costs that interoperability exerts on Carbon developers should be minimized.
Practical safety guarantees and testing mechanisms
Safety is important to maintain around interoperability code, and mitigations should be provided where possible. However, safety guarantees will be focused on native Carbon code. C++ code will not benefit from the same set of safety mechanisms that Carbon offers, so Carbon code calling into C++ will accept higher safety risks.
Fast and scalable development
The interoperability layer will likely have tooling limitations similar to C++. For example, Carbon aims to compile quickly. However, C++ interoperability hinges on compiling C++ code, which is relatively slow. Carbon libraries that use interoperability will see bottlenecks from C++ compile time. Improving C++ is outside the scope of Carbon.
Modern OS platforms, hardware architectures, and environments
Interoperability will apply to the intersection of environments supported by both Carbon and C++. Pragmatically, Carbon will likely be the limiting factor here.
Interoperability with and migration from existing C++ code
Carbon’s language goal for interoperability will focus on C++17 compatibility. The language design must be mindful of the prioritization; trade-offs harming other goals may still be made so long as they offer greater benefits for interoperability and Carbon as a whole.
Although the below interoperability-specific goals will focus on interoperability, it’s also important to consider how migration would be affected. If interoperability requires complex work, particularly to avoid performance impacts, it could impair the ability to incrementally migrate C++ codebases to Carbon.
Goals
Support mixing Carbon and C++ toolchains
The Carbon toolchain will support compiling C++ code. It will contain a customized C++ compiler that enables some more advanced interoperability features, such as calling Carbon templates from C++.
Mixing toolchains will also be supported in both directions:
- C++ libraries compiled by a non-Carbon toolchain will be usable from Carbon, so long as they are ABI-compatible with Carbon’s C++ toolchain.
- The Carbon toolchain will support, as an option, generating a C++ header and object file from a Carbon library, with an ABI that’s suitable for use with non-Carbon toolchains.
Mixing toolchains restricts functionality to what’s feasible with the C++ ABI. For example, developers should expect that Carbon templates will be callable from C++ when using the Carbon toolchain, and will not be available when mixing toolchains because it would require a substantially different and more complex interoperability implementation. This degraded interoperability should still be sufficient for most developers, albeit with the potential of more bridge code.
Any C++ interoperability code that works when mixing toolchains must work when using the native Carbon toolchain. The mixed toolchain support must not have semantic divergence. The converse is not true, and the native Carbon toolchain may have additional language support and optimizations.
Compatibility with the C++ memory model
It must be straightforward for any Carbon interoperability code to be compatible with the C++ memory model. This does not mean that Carbon must exclusively use the C++ memory model, only that it must be supported.
Minimize bridge code
The majority of simple C++ functions and types should be usable from Carbon without any custom bridge code and without any runtime overhead. That is, Carbon code should be able to call most C++ code without any code changes to add support for interoperability, even if that code was built with a non-Carbon toolchain. This includes instantiating Carbon templates or generics using C++ types.
In the other direction, Carbon may need some minimal markup to expose functions and types to C++. This should help avoid requiring Carbon to generate C++-compatible endpoints unconditionally, which could have compile and linking overheads that may in many cases be unnecessary. Also, it should help produce errors that indicate when a function or type may require additional changes to make compatible with C++.
Our priority is that Carbon developers should be able to easily reuse the mature ecosystem of C++ libraries provided by third-parties. A third-party library’s language choice should not be a barrier to Carbon adoption.
Even for first-party libraries, migration of C++ codebases to Carbon will often be incremental due to human costs of executing and verifying source migrations. Minimizing the amount of bridge code required should be expected to simplify such migrations.
Unsurprising mappings between C++ and Carbon types
Carbon will provide unsurprising mappings for common types.
Primitive types will have mappings with zero overhead conversions. They are frequently used, making it important that interoperability code be able to use them seamlessly.
The storage and representation will need to be equivalent in both languages. For example, if a C++ __int64 maps to Carbon’s Int64, the memory layout of both types must be identical.
Semantics need to be similar, but edge-case behaviors don’t need to be identical, allowing Carbon flexibility to evolve. For example, where C++ would have modulo wrapping on integers, Carbon could instead have trapping behavior on the default-mapped primitive types.
Carbon may have versions of these types with no C++ mapping, such as Int256.
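The distinction between identical storage and differing edge-case semantics can be illustrated in C++ alone. This is a minimal sketch, assuming std::int64_t as a portable stand-in for __int64; WrappingAdd and CheckedAdd are hypothetical helpers, and CheckedAdd reports overflow through std::optional rather than actually trapping so that the behavior can be observed.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>
#include <limits>
#include <optional>

// Storage: a zero-overhead mapping relies on both sides agreeing on width.
static_assert(sizeof(std::int64_t) * CHAR_BIT == 64,
              "a 64-bit integer occupies exactly 64 bits");

// C++ semantics: unsigned arithmetic wraps modulo 2^32.
std::uint32_t WrappingAdd(std::uint32_t a, std::uint32_t b) {
  return static_cast<std::uint32_t>(a + b);
}

// Stand-in for trapping semantics: the same storage, but overflow is
// reported to the caller instead of silently wrapping.
std::optional<std::uint32_t> CheckedAdd(std::uint32_t a, std::uint32_t b) {
  if (a > std::numeric_limits<std::uint32_t>::max() - b) {
    return std::nullopt;
  }
  return static_cast<std::uint32_t>(a + b);
}
```

Both functions operate on the same 32-bit representation; only the overflow case differs, which is the kind of edge-case divergence the mapping permits.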
Non-owning vocabulary types, such as pointers and references, will have transparent, automatic translation between C++ and Carbon non-owning vocabulary types with zero overhead.
Other vocabulary types will typically have reasonable, but potentially non-zero overhead, conversions available to map into Carbon vocabulary types. Code using these may choose whether to pay the overhead to convert. They may also use the C++ type directly from Carbon code, and the other way around.
As with primitive types, incomplete types must have a mapping with similar semantics.
Allow C++ bridge code in Carbon files
Carbon files should support inline bridge code written in C++. Where bridge code is necessary, this will allow for maintenance of it directly alongside the code that uses it.
Carbon inheritance from C++ types
Carbon will support inheritance from C++ types for interoperability, although the syntax constructs may look different from C++ inheritance. This is considered necessary to address cases where a C++ library API expects users to inherit from a given C++ type.
This might be restricted to pure interface types; see the open question.
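For context, the kind of C++ API that motivates this goal might look like the following sketch. EventListener, DispatchAll, and CountingListener are hypothetical names invented for illustration; the point is only that the library offers no way to receive callbacks other than deriving from its interface type.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical library API: a pure interface that users must implement.
class EventListener {
 public:
  virtual ~EventListener() = default;
  virtual void OnEvent(const std::string& name) = 0;
};

// Hypothetical library entry point that calls back through the interface.
void DispatchAll(const std::vector<std::string>& events,
                 EventListener& listener) {
  for (const std::string& name : events) {
    listener.OnEvent(name);
  }
}

// User code: the API can only be used by inheriting from EventListener.
class CountingListener final : public EventListener {
 public:
  void OnEvent(const std::string& name) override {
    assert(!name.empty());
    ++count_;
    last_ = name;
  }
  int count() const { return count_; }
  const std::string& last() const { return last_; }

 private:
  int count_ = 0;
  std::string last_;
};
```

Here EventListener is a pure interface, which is the simpler case; a base class with data members or non-virtual behavior would fall under the open question on non-pure interface types.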
Support use of advanced C++ features
There should be support for most idiomatic usage of advanced C++ features. A few examples are templates, overload sets, attributes and ADL.
Although these features can be considered “advanced”, their use is widespread throughout C++ code, including the STL. Support for such features is key to supporting migration of C++ code that relies on them.
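To show how ordinary these features are in practice, here is a small C++ sketch combining a template, an overload set, and ADL. The geometry namespace, Point, Describe, and DescribeTwice are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <string>

namespace geometry {

struct Point {
  int x;
  int y;
};

// Overload set: one name, several parameter types.
std::string Describe(int value) { return "int:" + std::to_string(value); }
std::string Describe(const std::string& value) { return "text:" + value; }
std::string Describe(const Point& p) {
  return "point:" + std::to_string(p.x) + "," + std::to_string(p.y);
}

}  // namespace geometry

// Template: the unqualified call to Describe is resolved at instantiation.
// For a geometry::Point argument, ADL finds geometry::Describe even though
// this template never names the geometry namespace.
template <typename T>
std::string DescribeTwice(const T& value) {
  return Describe(value) + "|" + Describe(value);
}
```

An interoperability layer that handled only free functions with fixed signatures would not be able to express a call such as DescribeTwice(point), since the callee is selected by both overload resolution and the argument's namespace.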
Support basic C interoperability
C interoperability support must be sufficient for Carbon code to call popular APIs that are written in C. The ability of C to call Carbon will be more restricted, limited to where it echoes C++ interoperability support. Basic C interoperability will include functions, primitive types, and structs that only contain member variables.
Features where interoperability will rely on more advanced C++-specific features, such as templates, inheritance, and class functions, need not be supported for C. These would require a C-specific interoperability model that will not be included.
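The basic subset described above corresponds roughly to what C++ itself can share with C today. A minimal sketch, with Rect and rect_area as hypothetical names:

```cpp
#include <cassert>
#include <type_traits>

extern "C" {

// A struct that only contains member variables, so C can use it as-is.
struct Rect {
  int width;
  int height;
};

// A function with C linkage: no overloading and no C++ name mangling.
int rect_area(const struct Rect* rect) {
  assert(rect != nullptr);
  return rect->width * rect->height;
}

}  // extern "C"

// Compile-time checks that the struct stays within the C-compatible subset.
static_assert(std::is_standard_layout<Rect>::value,
              "member layout follows C rules");
static_assert(std::is_trivially_copyable<Rect>::value,
              "can be copied bytewise");
```

Adding a virtual function or a base class with data members to Rect would make the is_standard_layout check fail, which mirrors the boundary drawn here between basic C support and C++-specific features.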
Non-goals
Full parity between a Carbon-only toolchain and mixing C++/Carbon toolchains
Making mixed C++/Carbon toolchain support equivalent to Carbon-only toolchain support affects all interoperability features. Mixed toolchains will have degraded support because full parity would be too expensive.
The feature of calling Carbon templates from C++ code is key when analyzing this option. Template instantiation during compilation is pervasive in C++.
With a Carbon toolchain compiling both Carbon and C++ code, the C++ compiler can be modified to handle Carbon templates differently. Carbon templates can be handled by exposing the Carbon compiler’s AST to the C++ compiler directly, as a compiler extension. While this approach is still complex and may not always work, it should offer substantial value and ability to migrate C++ code to Carbon without requiring parallel maintenance of implementations in C++.
With a mixed toolchain, the C++ compiler cannot be modified to handle Carbon templates differently. The only way to support template instantiation would be by having Carbon templates converted into equivalent C++ templates in C++ headers; in other words, template support would require source-to-source translation. Supporting Carbon to C++ code translations would be a complex and high cost feature to achieve full parity for mixed toolchains. Requiring bridge code for mixed toolchains is the likely solution to avoid this cost.
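The bridge-code alternative can be sketched by analogy, using C++ on both sides. In this illustration, the template Largest stands in for a template that the other toolchain cannot instantiate, and the non-template functions are the bridge; all names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Stand-in for a template that is unavailable across the toolchain boundary.
template <typename T>
T Largest(const T& a, const T& b) {
  return (a < b) ? b : a;
}

// Bridge code: fixed, non-template entry points for the specific
// instantiations that callers on the other side actually need. These have
// ordinary symbols that any ABI-compatible toolchain can link against.
int LargestInt(int a, int b) { return Largest<int>(a, b); }
double LargestDouble(double a, double b) { return Largest<double>(a, b); }
std::string LargestString(const std::string& a, const std::string& b) {
  return Largest<std::string>(a, b);
}
```

The cost is visible in the sketch: every additional type needs another hand-written entry point, which is the extra bridge code that mixed toolchains would be expected to carry.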
Note that this issue differs when considering interoperability for Carbon code instantiating C++ templates. The C++ templates must be in C++ headers for reuse, which in turn must compile with the Carbon toolchain to reuse the built C++ code, regardless of whether a separate C++ toolchain is in use. This may also be considered a constraint on mixed toolchain interoperability, but it’s simpler to address and less likely to burden developers.
To summarize, developers should expect that while most features will work equivalently for mixed toolchains, there will never be full parity.
Never require bridge code
Corner cases of C++ will not receive equal support to common cases: the complexity of supporting any given construct must be balanced by the real world need for that support. For example:
- Interoperability will target C++17. Any interoperability support for future versions of C++, including features such as C++20 modules, will be based on a cost-benefit analysis. Exhaustive support should not be assumed.
- Support will be focused on idiomatic code, interfaces, and patterns used in widespread open source libraries or by other key constituencies. C++ code will have edge cases where the benefits of limiting Carbon’s maintenance costs by avoiding complex interoperability outweigh the value of avoiding bridge code.
- Support for low-level C ABIs may be focused on modern 64-bit ABIs, including Linux, POSIX, and a small subset of Windows’ calling conventions.
Convert all C++ types to Carbon types
Non-zero overhead conversions should only be supported, never required, in order to offer reliable, unsurprising performance behaviors. This does not mean that conversions will always be supported, as support is a cost-benefit decision for specific type mappings. For example, consider conversions between std::vector<T> and an equivalent, idiomatic Carbon type:
- Making conversions zero-overhead would require the Carbon type to mirror the memory layout and implementation semantics of std::vector<T>. However, doing so would constrain the evolution of the Carbon type to match C++. Although some constraints are accepted for most primitive types, it would pose a major burden on Carbon’s evolution to constrain Carbon’s types to match C++ vocabulary type implementations.
- These conversions may not always be present, but std::vector<T> is a frequently used type. As a result, it can be expected that there will be functions supporting a copy-based conversion to the idiomatic Carbon type.
- An interface which can hide the difference between whether std::vector<T> or the equivalent, idiomatic Carbon type is in use may also be offered for common types.
- It will still be normal to handle C++ types in Carbon code without conversions. Developers should be given the choice of when to convert.
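The trade-off in these points can be sketched in C++ alone. FixedArray below is a hypothetical stand-in for a type whose layout differs from std::vector<T>; ToFixedArray and Total are likewise invented for illustration and do not describe any actual Carbon API.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical type with its own layout, unrelated to std::vector.
template <typename T>
class FixedArray {
 public:
  explicit FixedArray(std::size_t size) : size_(size), data_(new T[size]()) {}
  std::size_t size() const { return size_; }
  T& operator[](std::size_t i) {
    assert(i < size_);
    return data_[i];
  }
  const T& operator[](std::size_t i) const {
    assert(i < size_);
    return data_[i];
  }

 private:
  std::size_t size_;
  std::unique_ptr<T[]> data_;
};

// Copy-based conversion: costs O(n), and is paid only where the caller
// explicitly asks for it.
template <typename T>
FixedArray<T> ToFixedArray(const std::vector<T>& source) {
  FixedArray<T> result(source.size());
  for (std::size_t i = 0; i < source.size(); ++i) {
    result[i] = source[i];
  }
  return result;
}

// An interface that hides which type is in use: it needs only size() and
// operator[], so no conversion is required to call it.
template <typename Sequence>
long long Total(const Sequence& values) {
  long long total = 0;
  for (std::size_t i = 0; i < values.size(); ++i) {
    total += values[i];
  }
  return total;
}
```

Calling Total on the original std::vector avoids the copy entirely, while ToFixedArray is available for code that prefers the other type and accepts the cost.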
Support for C++ exceptions without bridge code
Carbon may not provide seamless interoperability support for C++ exceptions. For example, translating C++ exceptions to or from Carbon errors might require annotations or bridge code, and those translations may have some performance overhead or lose information. Furthermore, if Carbon code calls a C++ function without suitable annotations or bridging, and that function exits with an exception, the program might terminate.
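What such bridge code might involve can be sketched in C++ by translating exceptions into an error value at a noexcept boundary. ParsePositive, ParseResult, and ParsePositiveNoThrow are hypothetical names; the error type is a deliberately simple assumption, not a proposed Carbon design.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Stand-in for a C++ library function that reports failure by throwing.
int ParsePositive(const std::string& text) {
  int value = std::stoi(text);  // Throws on non-numeric or out-of-range text.
  if (value <= 0) {
    throw std::domain_error("value must be positive");
  }
  return value;
}

// Hypothetical error-value type used on the non-throwing side.
struct ParseResult {
  bool ok;
  int value;
  std::string error;  // Only the message survives; the exception type is lost.
};

// Bridge code: no exception may leave this function. Anything that escaped
// a noexcept function would call std::terminate.
ParseResult ParsePositiveNoThrow(const std::string& text) noexcept {
  try {
    return ParseResult{true, ParsePositive(text), ""};
  } catch (const std::exception& e) {
    return ParseResult{false, 0, e.what()};
  } catch (...) {
    return ParseResult{false, 0, "unknown exception"};
  }
}
```

The sketch shows both costs mentioned above: the try block adds a translation step on every call, and the result keeps only the message text rather than the original exception object.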
Cross-language metaprogramming
Carbon’s metaprogramming design will be more restrictive than C++’s preprocessor macros. Although interoperability should handle simple cases, such as #define STDIN_FILENO 0, complex metaprogramming libraries may require a deep ability to understand code rewrites. It should be reasonable to have these instead rewritten to use Carbon’s metaprogramming model.
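The difference between the simple and complex cases can be seen in a short C++ sketch. The EXAMPLE_ prefix is used to avoid clashing with the real STDIN_FILENO from system headers; Config and the getter macro are hypothetical.

```cpp
#include <cassert>

// Simple case: an object-like macro that only names a constant.
#define EXAMPLE_STDIN_FILENO 0

// It maps cleanly onto a typed constant.
constexpr int kExampleStdinFileno = EXAMPLE_STDIN_FILENO;

// Complex case: a function-like macro that pastes tokens to generate member
// function declarations. Its meaning depends on where it is expanded.
#define EXAMPLE_DEFINE_GETTER(type, name) \
  type get_##name() const { return name##_; }

struct Config {
  int port_ = 8080;
  bool verbose_ = false;
  EXAMPLE_DEFINE_GETTER(int, port)     // Expands to: int get_port() const
  EXAMPLE_DEFINE_GETTER(bool, verbose) // Expands to: bool get_verbose() const
};
```

The first macro has a value that can be read off directly, whereas the second has no value at all: it rewrites the surrounding class body, which is the kind of construct that would more plausibly be rewritten than bridged.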
Open questions to be resolved later
Carbon type inheritance from non-pure interface C++ types
Some C++ APIs will expect that consumers use classes that inherit from a type provided by the API. It’s desirable to have Carbon support, in some way, inheritance from API types in order to use these APIs.
It may be sufficient to require the parent type be a pure interface, and that APIs with non-pure-interface parent types either use bridge code or switch implementations. That will be determined later.
CRTP support
Although CRTP is a common technique in C++, interoperability support may require substantial work. Libraries based on use of CRTP may require bridge code or a rewrite for Carbon interoperability.
More analysis should be done on the cost-benefit of supporting CRTP before making a support decision.
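For readers less familiar with the technique, a minimal C++ sketch of CRTP follows. Shape, Square, and Rectangle are hypothetical names used only for illustration.

```cpp
#include <cassert>

// CRTP base: takes the derived class as a template parameter and calls into
// it through a static_cast, with no virtual dispatch involved.
template <typename Derived>
class Shape {
 public:
  double ScaledArea(double factor) const {
    assert(factor >= 0.0);
    return factor * static_cast<const Derived*>(this)->Area();
  }
};

class Square : public Shape<Square> {
 public:
  explicit Square(double side) : side_(side) {}
  double Area() const { return side_ * side_; }

 private:
  double side_;
};

class Rectangle : public Shape<Rectangle> {
 public:
  Rectangle(double width, double height) : width_(width), height_(height) {}
  double Area() const { return width_ * height_; }

 private:
  double width_;
  double height_;
};
```

Each derived class inherits from a distinct instantiation of the base template that names the derived class itself, so supporting this pattern across languages involves both inheritance and template instantiation at once.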
Object lifetimes
Carbon may have a different object lifetime design than C++. For example, Carbon may choose different rules for determining the lifetime of temporaries. This could affect idiomatic use of C++ APIs, turning code that would be safe in C++ into unsafe Carbon code, requiring developers to learn new coding patterns.
More analysis should be done on object lifetimes and potential Carbon designs for them before deciding how to treat object lifetimes in the scope of interoperability.
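As a concrete reference point, the C++ rules for temporaries that idiomatic code relies on can be sketched as follows. MakeGreeting, GreetingLength, and CommaPosition are hypothetical helpers; how Carbon would treat the equivalent code is exactly what remains open.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

std::string MakeGreeting(const std::string& name) { return "hello, " + name; }

std::size_t GreetingLength(const std::string& name) {
  // In C++, binding a temporary to a const reference extends the temporary's
  // lifetime to that of the reference, so this use is safe.
  const std::string& greeting = MakeGreeting(name);
  return greeting.size();
}

std::size_t CommaPosition(const std::string& name) {
  // In C++, a temporary lives until the end of the full-expression, so
  // calling a member function on it directly is also safe.
  return MakeGreeting(name).find(',');
}
```

If temporaries were destroyed earlier under a different lifetime design, patterns like the reference binding in GreetingLength would leave a dangling reference, which is the risk described above.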
Rationale
The core team agrees with the mapping of the proposed interoperability goals to the overall Carbon goals as rationale.
Embedding of C++ bridge code inside Carbon code is inspired in part by experience with CUDA and SYCL, where users provided feedback that these were preferred over alternatives such as OpenCL due to the ability to embed bridge code and device kernels in the same source file as host C++ code. It also matches the approach of bridging between C++ and C where bridge code is often embedded as extern "C" within C++ source files.
The non-goals seem appropriately motivated by the non-interoperability priorities, without undermining the effectiveness of the interoperability as a whole.
We agree with ensuring some level of C interoperability in order to effectively interoperate with both some parts of the C++ ecosystem that rely on the C ABI as well as all other languages which target C-based interoperability.
Open questions
All open questions in the document are intended to be answered later after we have more experience.