Language-level safety strategy

Problem
Background
Proposal
Open question: probabilistic checks in the hardened build mode
Rationale
- Open questions
  - Does the core team believe that we should put a cap on how much performance should be sacrificed for safety, putting more emphasis on probabilistic methods that would allow more attacks through?

Problem

Carbon needs to have a clear and consistent strategy for approaching the problems of language-level safety. These problems have been persistent and growing sources of both bugs and security vulnerabilities in C and C++ software. Failure to effectively and carefully address safety concerns is likely to undermine any hope of Carbon being a successful path forward for today’s C++ users.

Background

Fearless Security: Memory Safety
A proactive approach to more secure code
Chromium memory safety
MemSafe
- Notably introduces the terms “spatial” and “temporal” safety.

Proposal

We propose a safety strategy for Carbon that aims for incrementally increasing the compile-time proven safety while allowing for dynamic checks to cover what remains. It also prioritizes dynamic safety checks that are amenable to being optimized away or being manually disabled for performance-critical use cases where the added dynamic protections are not a viable trade-off.

Open question: probabilistic checks in the hardened build mode

This proposal explicitly discourages probabilistic checks in the hardened build mode because they won’t reliably prevent security attacks. Does the core team believe that we should put a cap on how much performance should be sacrificed for safety, putting more emphasis on probabilistic methods that would allow more attacks through?

For example, heap use-after-free detection with 100% accuracy is expected to be a significant expense for hardened builds. MarkUs is estimated to cost 10-50% CPU and 25% RAM in order to catch 100% of issues. For comparison, MTE is estimated to cost 0-20% CPU and 3-6% RAM in order to catch 93% of issues.

The CPU and RAM cost of MarkUs is significant, even by comparison with other techniques, and costs will add up as more safety is added. 93% is a reasonably high detection rate for an performance-efficient, probabilistic technique. Would the core team expect to use MTE instead MarkUs in hardened builds?

Rationale

Most of Carbon’s goals can be addressed in a somewhat piecemeal fashion. For example, there’s probably no need for our designs for generics and for sum types to coordinate how they address performance or readability. Safety, on the other hand, is much more cross-cutting, and so it’s important for us to approach it consistently across the whole language. This proposal gives us a common vocabulary for discussing safety, establishes some well-motivated common principles, and provides an overall strategy based on those principles, all of which will be essential to achieving that consistency.

This proposal gives a solid basis for thinking about safety in future proposals. The pragmatic choice to focus on the security aspects of safety seems well-aligned with Carbon’s goals. In particular, a more idealistic approach to safety, in which every language construct has bounded behavior, would be likely to result in safety being prioritized over performance. By instead considering safety largely from the perspective of security vulnerabilities, and accepting that there will be cases where the pragmatic choice is hardening rather than static or dynamic checks, we can focus on delivering practical safety without being overly distracted from other goals.

It is critical that Carbon use build modes to enable writing performance-optimized code (much like C++ today) that can still be built and deployed at reasonable engineering cost with strong guarantees around memory safety for the purposes of security. We think this proposal provides that foundation.

Note: The decision by the core team included a request to clarify the wording on build modes as described in chandlerc’s post (broken link: https://forums.carbon-lang.dev/t/request-for-decision-language-level-safety-strategy/196/6) on the decision thread.

Open questions

Does the core team believe that we should put a cap on how much performance should be sacrificed for safety, putting more emphasis on probabilistic methods that would allow more attacks through?

There is no cap at this point. Where possible, static checking should be used. In practice, there will be some who will not use the safest option until the cost gets low enough.