Explorer structured fuzzer
Overview
Fuzz testing is based on generating a large number of random inputs for a software component in order to trigger bugs and unexpected behavior. Basic fuzzing uses randomly generated arrays of bytes as inputs, which works well for some applications but is problematic for testing logic that operates on highly structured data: most random inputs are immediately rejected as invalid before any interesting parts of the code get a chance to run.
Structured fuzzing addresses this issue by ensuring the randomly generated data is itself structured, and as such is much more likely to be a valid input.
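The difference can be illustrated with a toy example. The sketch below is not part of explorer_fuzzer: it assumes a made-up "language" of integers joined by + or *, and compares how often random bytes and structure-aware generation produce an input that a validator accepts. All names in it are hypothetical.

```python
import random
import re

# Toy "language" (an assumption for illustration): integers joined by + or *.
VALID = re.compile(r"[0-9]+([+*][0-9]+)*")


def is_valid(text: str) -> bool:
    """Stand-in for the parser that rejects malformed input early."""
    return VALID.fullmatch(text) is not None


def random_bytes_input(rng: random.Random, length: int = 8) -> str:
    """Unstructured fuzzing: arbitrary bytes, decoded one byte per character."""
    return bytes(rng.randrange(256) for _ in range(length)).decode("latin-1")


def structured_input(rng: random.Random, max_terms: int = 4) -> str:
    """Structured fuzzing: pick random terms and operators, then print them."""
    terms = [str(rng.randrange(100)) for _ in range(rng.randint(1, max_terms))]
    out = terms[0]
    for term in terms[1:]:
        out += rng.choice("+*") + term
    return out


def acceptance_rate(generate, rng: random.Random, trials: int = 1000) -> float:
    """Fraction of generated inputs that get past the validator."""
    return sum(is_valid(generate(rng)) for _ in range(trials)) / trials
```

With a seeded `random.Random`, `acceptance_rate(structured_input, rng)` is 1.0 by construction, while the random-bytes generator almost never produces an accepted input, so nearly all of its effort is spent in the rejection path.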
explorer_fuzzer is a structured fuzzer based on libprotobuf-mutator, a library that randomly mutates protocol buffers.
The input to the fuzzer is an instance of the Carbon::Fuzzing::Carbon proto, randomly generated by the libprotobuf-mutator framework. explorer_fuzzer converts the proto to a Carbon source code string, then tries to parse and execute the code using the explorer implementation.
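The convert-then-execute flow can be sketched in a few lines. This is a toy illustration in Python, not the fuzzer's actual C++ code: `Literal`, `Add`, and `Program` are hypothetical stand-ins for the real proto messages, the emitted text is only Carbon-like, and `run` is a placeholder for handing the source to explorer.

```python
from dataclasses import dataclass
from typing import Callable, Union


# Hypothetical stand-ins for proto messages; the real schema lives in carbon.proto.
@dataclass
class Literal:
    value: int


@dataclass
class Add:
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[Literal, Add]


@dataclass
class Program:
    result: Expr


def expr_to_source(expr: Expr) -> str:
    """Print one expression message as source text."""
    if isinstance(expr, Literal):
        return str(expr.value)
    return f"({expr_to_source(expr.lhs)} + {expr_to_source(expr.rhs)})"


def program_to_source(program: Program) -> str:
    """Wrap the expression in a Carbon-like entry point."""
    return "fn Main() -> i32 { return " + expr_to_source(program.result) + "; }"


def fuzz_one_input(program: Program, run: Callable[[str], None]) -> str:
    """Convert the structured input to source, then pass it to the runner."""
    source = program_to_source(program)
    run(source)  # placeholder for parsing and executing with explorer
    return source
```

Because the mutator only ever changes the structured message, every input that reaches `run` is at least well-formed text, which is the point of the structured approach.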
Fuzzer data format
libprotobuf-mutator supports fuzzer inputs in either text or binary protocol buffer format. explorer_fuzzer uses the text proto format, with the Carbon proto message defined in testing/fuzzing/carbon.proto.
Incorporating AST changes into the fuzzer
The fuzzer's AST representation in carbon.proto needs to be updated when changes are made to the AST, such as adding new AST node classes or changing relevant data members of existing nodes.
ast_to_proto_test should not normally require direct changes, as its tests work off the Carbon test files in testdata.
To incorporate AST changes into fuzzing logic:
- Add appropriate AST information to carbon.proto. Use existing similar cases as examples.
- Modify proto_to_carbon.cpp, which handles printing of a Carbon proto instance as a Carbon source string. For example, add code to print newly introduced proto fields.
- Add logic to populate the proto to ast_to_proto.cpp.
- Make sure bazel test //explorer/fuzzing:ast_to_proto_test passes with the new changes.
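The reason all of these places need to change together can be seen in a small model. The sketch below is a Python toy, not the real C++ code: `IntLiteral` and `Negate` are hypothetical AST nodes, a nested dict stands in for the proto, and the two functions play the roles of ast_to_proto.cpp and proto_to_carbon.cpp. A node kind that is missing from either function fails loudly.

```python
from dataclasses import dataclass


# Hypothetical AST node classes, for illustration only.
@dataclass
class IntLiteral:
    value: int


@dataclass
class Negate:  # plays the role of a newly added AST node
    operand: object


def ast_to_proto(node) -> dict:
    """Analogue of ast_to_proto.cpp: populate a proto-like dict from the AST."""
    if isinstance(node, IntLiteral):
        return {"int_literal": {"value": node.value}}
    if isinstance(node, Negate):
        return {"negate": {"operand": ast_to_proto(node.operand)}}
    raise NotImplementedError(f"no proto mapping for {type(node).__name__}")


def proto_to_source(proto: dict) -> str:
    """Analogue of proto_to_carbon.cpp: print the proto-like dict as source."""
    if "int_literal" in proto:
        return str(proto["int_literal"]["value"])
    if "negate" in proto:
        return "-(" + proto_to_source(proto["negate"]["operand"]) + ")"
    raise NotImplementedError(f"no printer for fields {sorted(proto)}")
```

In this model, removing the `Negate` branch from either function makes `proto_to_source(ast_to_proto(Negate(IntLiteral(7))))` raise instead of returning `-(7)`, which mirrors a schema, printer, or populating step that was left out.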
Running the fuzzer
The fuzzer can be run in ‘unit test’ mode, where the fuzzer executes on each input file from the fuzzer_corpus/ folder, or in ‘fuzzing’ mode, where the fuzzer will keep generating random inputs and executing the logic on them until a crash is triggered, or forever in a bug-free program ;).
To run in ‘unit test’ mode:
bazel test //explorer/fuzzing:explorer_fuzzer
To run in ‘fuzzing’ mode:
bazel build --config=fuzzer //explorer/fuzzing:explorer_fuzzer.full_corpus
bazel-bin/explorer/fuzzing/explorer_fuzzer.full_corpus
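Conceptually, the two modes differ only in where inputs come from. The following Python sketch models that difference under simplifying assumptions: `target` is any callable that raises on a "crash", `generate` is a hypothetical input generator, and the iteration cap exists only so the example terminates (a real fuzzing run has no such cap).

```python
import random
from pathlib import Path
from typing import Callable, Optional


def replay_corpus(corpus_dir: Path, target: Callable[[str], None]) -> int:
    """'Unit test' mode analogue: run the target once per corpus file."""
    count = 0
    for path in sorted(corpus_dir.iterdir()):
        if path.is_file():
            target(path.read_text())
            count += 1
    return count


def fuzz_until_crash(
    target: Callable[[str], None],
    generate: Callable[[random.Random], str],
    rng: random.Random,
    max_iterations: int,
) -> Optional[str]:
    """'Fuzzing' mode analogue: generate inputs until the target raises."""
    for _ in range(max_iterations):
        data = generate(rng)
        try:
            target(data)
        except Exception:
            return data  # the crashing input, worth saving for investigation
    return None  # no crash found within the cap
```

Replaying a fixed corpus is deterministic and fast, which suits regular test runs; the open-ended loop is what actually discovers new failures.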
Investigating a crash
Typically it’s going to be easiest to run explorer directly on the problematic Carbon program. You can do this with:
# Convert a specific fuzzer test to a source file
bazel run //testing/fuzzing:proto_to_carbon -- explorer/fuzzing/fuzzer_corpus/abcd1234 > crash.carbon
# Or convert the crash to a source file.
bazel run //testing/fuzzing:proto_to_carbon -- /tmp/crash.textproto > crash.carbon
# Run explorer on the crash.
bazel run //explorer -- crash.carbon
It’s also possible to run the fuzzer on a single input:
bazel-bin/explorer/fuzzing/explorer_fuzzer.full_corpus /tmp/crash.textproto
Generating new fuzzer corpus entries
The ability of the fuzzing framework to generate ‘interesting’ inputs can be improved by providing ‘seed’ inputs known as the fuzzer corpus. Each input needs to be a Fuzzing::Carbon text proto.
To generate a text proto from Carbon source:
bazel run //explorer/fuzzing:ast_to_proto -- /tmp/crash.carbon > crash.textproto
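When several Carbon files need converting, the command above can be scripted. The helper below is a hypothetical sketch, not a tool from the repository: it only builds the same `bazel run` invocation for each source file and pairs it with an output path. Naming outputs `<stem>.textproto` is an assumption made for illustration; existing corpus entries may follow a different naming scheme.

```python
import subprocess
from pathlib import Path
from typing import List, Tuple

# Target taken from the command shown above.
AST_TO_PROTO_TARGET = "//explorer/fuzzing:ast_to_proto"


def corpus_commands(
    sources: List[Path], out_dir: Path
) -> List[Tuple[List[str], Path]]:
    """Pair each Carbon source with its bazel argv and the file for its stdout."""
    plan = []
    for src in sources:
        argv = ["bazel", "run", AST_TO_PROTO_TARGET, "--", str(src)]
        # Hypothetical naming scheme: reuse the source file's stem.
        plan.append((argv, out_dir / (src.stem + ".textproto")))
    return plan


def run_plan(plan: List[Tuple[List[str], Path]]) -> None:
    """Execute each command, redirecting stdout to the paired output file."""
    for argv, out_path in plan:
        with open(out_path, "w") as out_file:
            subprocess.run(argv, stdout=out_file, check=True)
```

Separating the planning step from execution makes it possible to print and review the commands before anything is run.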