Initial OSS-Fuzz Integration and First Fuzzing Test
Introduces an initial fuzzing test and supporting files for integrating Dulwich into OSS-Fuzz, as discussed in #1302. The corresponding PR on the OSS-Fuzz repo is google/oss-fuzz#11900.
Showing 7 changed files with 358 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,190 @@ | ||
# Fuzzing Dulwich | ||
|
||
[![Fuzzing Status](https://oss-fuzz-build-logs.storage.googleapis.com/badges/dulwich.svg)][oss-fuzz-issue-tracker] | ||
|
||
This directory contains files related to Dulwich's suite of fuzz tests that are executed daily on automated | ||
infrastructure provided by [OSS-Fuzz][oss-fuzz-repo]. This document aims to provide necessary information for working | ||
with fuzzing in Dulwich. | ||
|
||
The latest details regarding OSS-Fuzz test status, including build logs and coverage reports, are available | ||
on [the Open Source Fuzzing Introspection website](https://introspector.oss-fuzz.com/project-profile?project=dulwich). | ||
|
||
## How to Contribute | ||
|
||
There are many ways to contribute to Dulwich's fuzzing efforts! Contributions are welcomed through issues, | ||
discussions, or pull requests on this repository. | ||
|
||
Areas that are particularly appreciated include: | ||
|
||
- **Tackling the existing backlog of open issues**. While fuzzing is an effective way to identify bugs, those findings | ||
aren't useful unless the bugs are fixed. If you are not sure where to start, the issues tab is a great place to get ideas! | ||
- **Improvements to this (or other) documentation** make it easier for new contributors to get involved, so even small | ||
improvements can have a large impact over time. If you see something that could be made easier by a documentation | ||
update of any size, please consider suggesting it! | ||
|
||
For everything else, such as expanding test coverage, optimizing test performance, or enhancing error detection | ||
capabilities, jump into the "Getting Started" section below. | ||
|
||
## Getting Started with Fuzzing Dulwich | ||
|
||
> [!TIP] | ||
> **New to fuzzing or unfamiliar with OSS-Fuzz?** | ||
> | ||
> These resources are an excellent place to start: | ||
> | ||
> - [OSS-Fuzz documentation][oss-fuzz-docs] - Continuous fuzzing service for open source software. | ||
> - [Google/fuzzing][google-fuzzing-repo] - Tutorials, examples, discussions, research proposals, and other resources | ||
related to fuzzing. | ||
> - [CNCF Fuzzing Handbook](https://github.com/cncf/tag-security/blob/main/security-fuzzing-handbook/handbook-fuzzing.pdf) - | ||
A comprehensive guide for fuzzing open source software. | ||
> - [Efficient Fuzzing Guide by The Chromium Project](https://chromium.googlesource.com/chromium/src/+/main/testing/libfuzzer/efficient_fuzzing.md) - | ||
Explores strategies to enhance the effectiveness of your fuzz tests, recommended for those looking to optimize their | ||
testing efforts. | ||
|
||
### Setting Up Your Local Environment | ||
|
||
Before contributing to fuzzing efforts, ensure Python and Docker are installed on your machine. Docker is required for | ||
running fuzzers in containers provided by OSS-Fuzz. [Install Docker](https://docs.docker.com/get-docker/) following the official guide if you do not already have it. | ||
|
||
### Understanding Existing Fuzz Targets | ||
|
||
Review the `fuzz-targets/` directory to familiarize yourself with how existing tests are implemented. See | ||
the [Files & Directories Overview](#files--directories-overview) for more details on the directory structure. | ||
|
||
### Contributing to Fuzz Tests | ||
|
||
Start by reviewing the [Atheris documentation][atheris-repo] and the section | ||
on [Running Fuzzers Locally](#running-fuzzers-locally) to begin writing or improving fuzz tests. | ||
|
||
## Files & Directories Overview | ||
|
||
The `fuzzing/` directory is organized into three key areas: | ||
|
||
### Fuzz Targets (`fuzz-targets/`) | ||
|
||
Contains Python files for each fuzz test. | ||
|
||
**Things to Know**: | ||
|
||
- Each fuzz test targets a specific part of Dulwich's functionality. | ||
- Test files adhere to the naming convention: `fuzz_<API Under Test>.py`, where `<API Under Test>` indicates the | ||
functionality targeted by the test. | ||
- Any functionality that involves performing operations on input data is a possible candidate for fuzz testing, but | ||
features that involve processing untrusted user input or parsing operations are typically going to be the most | ||
interesting. | ||
- The goal of these tests is to identify previously unknown or unexpected error cases caused by a given input. For that | ||
reason, fuzz tests should gracefully handle anticipated exception cases with a `try`/`except` block to avoid false | ||
positives that halt the fuzzing engine. | ||
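
The last point above can be sketched without a fuzzing engine. This is a minimal illustration, not Dulwich code: `parse_record` is a hypothetical stand-in for the API under test, and `EXPECTED_ERRORS` is an assumed list of message fragments that the stand-in raises for malformed input. Anticipated `ValueError`s are swallowed, while anything else propagates so that a fuzzing engine would report it.

```python
# Hypothetical stand-in for the API under test; not part of Dulwich.
def parse_record(data: bytes) -> dict:
    if not data:
        raise ValueError("empty record")
    if b"=" not in data:
        raise ValueError("missing separator")
    key, _, value = data.partition(b"=")
    # Decoding can fail with UnicodeDecodeError, which is a ValueError subclass.
    return {key.decode("utf-8"): value}


# Message fragments for failures that are anticipated for malformed input.
EXPECTED_ERRORS = ("empty record", "missing separator")


def fuzz_one_input(data: bytes) -> None:
    """Call the target, ignoring anticipated errors and re-raising the rest."""
    try:
        parse_record(data)
    except ValueError as e:
        if any(fragment in str(e) for fragment in EXPECTED_ERRORS):
            return  # Anticipated failure: not a finding.
        raise  # Unanticipated failure: let it surface.
```

Matching on message fragments rather than on the exception type alone keeps the filter narrow: in this sketch, a `UnicodeDecodeError` is caught by the `except ValueError` clause but is still re-raised because its message is not in the list.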
|
||
### Dictionaries (`dictionaries/`) | ||
|
||
Provides hints to the fuzzing engine about inputs that might trigger unique code paths. Each fuzz target may have a | ||
corresponding `.dict` file. For information about dictionary syntax, refer to | ||
the [LibFuzzer documentation on the subject](https://llvm.org/docs/LibFuzzer.html#dictionaries). | ||
|
||
**Things to Know**: | ||
|
||
- OSS-Fuzz loads a dictionary file for a fuzz target only if one exists with the same name; all others are ignored. | ||
- Most entries in the dictionary files found here are escaped byte values that were recommended by the fuzzing | ||
engine after previous runs. | ||
- A default set of dictionary entries is created for all fuzz targets as part of the build process, regardless of an | ||
existing file here. | ||
- Development or updates to dictionaries should reflect the varied formats and edge cases relevant to the | ||
functionalities under test. | ||
- Example dictionaries (some of which are used to build the default dictionaries mentioned above) can be found here: | ||
- [AFL++ dictionary repository](https://github.com/AFLplusplus/AFLplusplus/tree/stable/dictionaries#readme) | ||
- [Google/fuzzing dictionary repository](https://github.com/google/fuzzing/tree/master/dictionaries) | ||
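
Because a single malformed line can cause a dictionary to be rejected, a quick syntax check before committing a `.dict` file may be helpful. The sketch below is a simplified approximation based on the syntax described in the LibFuzzer documentation (an optional `name=` prefix, a double-quoted value, and the escapes `\\`, `\"`, and `\xAB`). The helper name `find_invalid_entries` is hypothetical, and the real parser may accept or reject some edge cases differently.

```python
import re

# Simplified approximation of one dictionary entry: an optional `name=`
# prefix followed by a double-quoted value whose only escapes are
# \\, \" and \xAB. The real LibFuzzer parser may differ in edge cases.
ENTRY_RE = re.compile(
    r'^\s*(?:\w+\s*=\s*)?"(?:[^"\\]|\\\\|\\"|\\x[0-9A-Fa-f]{2})*"\s*$'
)


def find_invalid_entries(text: str) -> list:
    """Return (line_number, line) pairs that do not look like valid entries."""
    invalid = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # Blank lines and comments are ignored.
        if not ENTRY_RE.match(line):
            invalid.append((number, line))
    return invalid
```

A check along these lines would flag, for example, an entry with a missing closing quote or two entries that ended up on the same line.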
|
||
### OSS-Fuzz Scripts (`oss-fuzz-scripts/`) | ||
|
||
Includes scripts for building and integrating fuzz targets with OSS-Fuzz: | ||
|
||
- **`container-environment-bootstrap.sh`** - Sets up the execution environment. It is responsible for fetching default | ||
dictionary entries and ensuring all required build dependencies are installed and up-to-date. | ||
- **`build.sh`** - Executed within the Docker container, this script builds fuzz targets with necessary instrumentation | ||
and prepares seed corpora and dictionaries for use. | ||
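
The dictionary preparation performed by `build.sh` (appending the shared default entries to each target's dictionary) can be modeled in Python for illustration. This is a sketch of the idea rather than the script itself: `append_base_dictionary` and its parameters are hypothetical names, and the real script works on files under `$OUT` inside the container.

```python
from pathlib import Path


def append_base_dictionary(target_dict: Path, base_dict: Path) -> None:
    """Append base_dict to target_dict, separated by a newline if needed."""
    if not base_dict.is_file():
        return  # No shared entries to add.
    # If the target already has content, write a newline first so the last
    # existing entry and the first appended entry never share a line.
    needs_separator = target_dict.exists() and target_dict.stat().st_size > 0
    with target_dict.open("ab") as out:
        if needs_separator:
            out.write(b"\n")
        out.write(base_dict.read_bytes())
```

The extra newline is harmless when the existing file already ends with one, since empty lines in a dictionary are ignored.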
|
||
**Where to learn more:** | ||
|
||
- [OSS-Fuzz documentation on the build.sh](https://google.github.io/oss-fuzz/getting-started/new-project-guide/#buildsh) | ||
- [See Dulwich's build.sh and Dockerfile in the OSS-Fuzz repository](https://github.com/google/oss-fuzz/tree/master/projects/dulwich) | ||
|
||
## Running Fuzzers Locally | ||
|
||
This approach uses Docker images provided by OSS-Fuzz for building and running fuzz tests locally. It offers | ||
comprehensive features but requires a local clone of the OSS-Fuzz repository and sufficient disk space for Docker | ||
containers. | ||
|
||
### Build the Execution Environment | ||
|
||
Clone the OSS-Fuzz repository and prepare the Docker environment: | ||
|
||
```shell | ||
git clone --depth 1 https://github.com/google/oss-fuzz.git oss-fuzz | ||
cd oss-fuzz | ||
python infra/helper.py build_image dulwich | ||
python infra/helper.py build_fuzzers --sanitizer address dulwich | ||
``` | ||
|
||
> [!TIP] | ||
> The `build_fuzzers` command above accepts a local file path pointing to your Dulwich repository clone as the last | ||
> argument. | ||
> This makes it easy to build fuzz targets you are developing locally in this repository without changing anything in | ||
> the OSS-Fuzz repo! | ||
> For example, if you have cloned this repository (or a fork of it) into: `~/code/dulwich` | ||
> Then running this command would build new or modified fuzz targets using the `~/code/dulwich/fuzzing/fuzz-targets` | ||
> directory: | ||
> ```shell | ||
> python infra/helper.py build_fuzzers --sanitizer address dulwich ~/code/dulwich | ||
> ``` | ||
|
||
Verify the build of your fuzzers with the optional `check_build` command: | ||
|
||
```shell | ||
python infra/helper.py check_build dulwich | ||
``` | ||
|
||
### Run a Fuzz Target | ||
|
||
Setting an environment variable for the fuzz target argument of the execution command makes it easier to quickly select | ||
a different target between runs: | ||
|
||
```shell | ||
# specify the fuzz target without the .py extension: | ||
export FUZZ_TARGET=fuzz_configfile | ||
``` | ||
|
||
Execute the desired fuzz target: | ||
|
||
```shell | ||
python infra/helper.py run_fuzzer dulwich $FUZZ_TARGET -- -max_total_time=60 -print_final_stats=1 | ||
``` | ||
|
||
> [!TIP] | ||
> In the example above, the "`-- -max_total_time=60 -print_final_stats=1`" portion of the command is optional but quite | ||
> useful. | ||
> | ||
> Every argument provided after "`--`" in the above command is passed to the fuzzing engine directly. In this case: | ||
> - `-max_total_time=60` tells LibFuzzer to stop execution after 60 seconds have elapsed. | ||
> - `-print_final_stats=1` tells LibFuzzer to print a summary of useful metrics about the target run upon | ||
completion. | ||
> | ||
> But almost any [LibFuzzer option listed in the documentation](https://llvm.org/docs/LibFuzzer.html#options) should | ||
> work as well. | ||
|
||
#### Next Steps | ||
|
||
For detailed instructions on advanced features like reproducing OSS-Fuzz issues or using the Fuzz Introspector, refer | ||
to [the official OSS-Fuzz documentation][oss-fuzz-docs]. | ||
|
||
|
||
|
||
[oss-fuzz-repo]: https://github.com/google/oss-fuzz | ||
|
||
[oss-fuzz-docs]: https://google.github.io/oss-fuzz | ||
|
||
[oss-fuzz-issue-tracker]: https://bugs.chromium.org/p/oss-fuzz/issues/list?sort=-opened&can=1&q=proj:dulwich | ||
|
||
[google-fuzzing-repo]: https://github.com/google/fuzzing | ||
|
||
[atheris-repo]: https://github.com/google/atheris |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
"\\357\\273\\277" | ||
"\\\\\\015\\012" | ||
"\\001\\000" | ||
"\\000\\000\\000\\000" | ||
"\\001\\000\\000\\000" | ||
"\\377h" | ||
"-\\000\\000\\000\\000\\000\\000\\000" | ||
"[\\000\\000\\000\\000\\000\\000\\000" | ||
"H]\\000" | ||
"2\\000\\000\\000\\000\\000\\000\\000" | ||
"\\377\\377\\377\\377\\377\\377\\377;" | ||
"]\\377" | ||
"\\000\\000\\000\\000\\000\\000\\000B" | ||
"\\\\\\012" | ||
"\\000\\000\\000\\000\\000\\000\\0001" | ||
"rue" | ||
"b\\271\\"" | ||
"\\000\\000\\000\\000\\000\\000\\000]" | ||
"\\\\\\000\\000\\000\\000\\000\\000\\000" | ||
"\\330\\330 | ||
"\\000\\000\\000\\000\\000\\000\\000\\000" | ||
"\\377\\377\\377\\377" | ||
"%\\000\\000\\000\\000\\000\\000\\000" | ||
"\\000\\000\\000\\000\\000\\000\\000\\\\" | ||
"\\377\\377\\377\\377\\377\\377\\377$" | ||
"[\\000\\000\\000\\000\\000\\000\\000" | ||
"p\\012" | ||
"\\001\\000\\000\\000\\000\\000\\000\\"" | ||
"\\337\\000\\000\\000\\000\\000\\000\\000" | ||
"\\001\\000\\000\\000\\000\\000\\000\\000" | ||
"\\\\0=" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
import atheris | ||
import sys | ||
from io import BytesIO | ||
|
||
with atheris.instrument_imports(): | ||
    from dulwich.config import ConfigFile | ||
|
||
|
||
def is_expected_error(error_list, error_msg): | ||
    for error in error_list: | ||
        if error in error_msg: | ||
            return True | ||
    return False | ||
|
||
|
||
def TestOneInput(data): | ||
    try: | ||
        ConfigFile.from_file(BytesIO(data)) | ||
    except ValueError as e: | ||
        expected_errors = [ | ||
            "without section", | ||
            "invalid variable name", | ||
            "expected trailing ]", | ||
            "invalid section name", | ||
            "Invalid subsection", | ||
            "escape character", | ||
            "missing end quote", | ||
        ] | ||
        if is_expected_error(expected_errors, str(e)): | ||
            return -1 | ||
        else: | ||
            raise e | ||
|
||
|
||
def main(): | ||
    atheris.Setup(sys.argv, TestOneInput) | ||
    atheris.Fuzz() | ||
|
||
|
||
if __name__ == "__main__": | ||
    main() |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
# shellcheck shell=bash | ||
|
||
set -euo pipefail | ||
|
||
python3 -m pip install . | ||
|
||
# Directory to look in for dictionaries, options files, and seed corpora: | ||
SEED_DATA_DIR="$SRC/seed_data" | ||
|
||
find "$SEED_DATA_DIR" \( -name '*_seed_corpus.zip' -o -name '*.options' -o -name '*.dict' \) \ | ||
  ! \( -name '__base.*' \) -exec printf 'Copying: %s\n' {} \; \ | ||
  -exec chmod a-x {} \; \ | ||
  -exec cp {} "$OUT" \; | ||
|
||
# Build fuzzers in $OUT. | ||
find "$SRC/dulwich/fuzzing" -name 'fuzz_*.py' -print0 | while IFS= read -r -d '' fuzz_harness; do | ||
  compile_python_fuzzer "$fuzz_harness" | ||
|
||
  common_base_dictionary_filename="$SEED_DATA_DIR/__base.dict" | ||
  if [[ -r "$common_base_dictionary_filename" ]]; then | ||
    # Strip the `.py` extension from the filename and replace it with `.dict`. | ||
    fuzz_harness_dictionary_filename="$(basename "$fuzz_harness" .py).dict" | ||
    output_file="$OUT/$fuzz_harness_dictionary_filename" | ||
|
||
    printf 'Appending %s to %s\n' "$common_base_dictionary_filename" "$output_file" | ||
    if [[ -s "$output_file" ]]; then | ||
      # If a dictionary file for this fuzzer already exists and is not empty, | ||
      # we append a new line to the end of it before appending any new entries. | ||
      # | ||
      # LibFuzzer will happily ignore multiple empty lines in a dictionary but fail with an error | ||
      # if any single line has incorrect syntax (e.g., if we accidentally add two entries to the same line.) | ||
      # See docs for valid syntax: https://llvm.org/docs/LibFuzzer.html#id32 | ||
      echo >>"$output_file" | ||
    fi | ||
    cat "$common_base_dictionary_filename" >>"$output_file" | ||
  fi | ||
done |
fuzzing/oss-fuzz-scripts/container-environment-bootstrap.sh (55 additions & 0 deletions)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,55 @@ | ||
#!/usr/bin/env bash | ||
|
||
set -euo pipefail | ||
|
||
################# | ||
# Prerequisites # | ||
################# | ||
|
||
for cmd in python3 git wget rsync; do | ||
  command -v "$cmd" >/dev/null 2>&1 || { | ||
    printf '[%s] Required command %s not found, exiting.\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$cmd" >&2 | ||
    exit 1 | ||
  } | ||
done | ||
|
||
SEED_DATA_DIR="$SRC/seed_data" | ||
mkdir -p "$SEED_DATA_DIR" | ||
|
||
############# | ||
# Functions # | ||
############# | ||
|
||
download_and_concatenate_common_dictionaries() { | ||
  # Assign the first argument as the target file where all contents will be concatenated | ||
  target_file="$1" | ||
|
||
  # Shift the arguments so the first argument (target_file path) is removed | ||
  # and only URLs are left for the loop below. | ||
  shift | ||
|
||
  for url in "$@"; do | ||
    wget -qO- "$url" >>"$target_file" | ||
    # Ensure there's a newline between each file's content | ||
    echo >>"$target_file" | ||
  done | ||
} | ||
|
||
fetch_seed_data() { | ||
  rsync -avc "$SRC/dulwich/fuzzing/dictionaries/" "$SEED_DATA_DIR/" | ||
} | ||
|
||
######################## | ||
# Main execution logic # | ||
######################## | ||
|
||
fetch_seed_data | ||
|
||
download_and_concatenate_common_dictionaries "$SEED_DATA_DIR/__base.dict" \ | ||
  "https://raw.githubusercontent.com/google/fuzzing/master/dictionaries/utf8.dict" \ | ||
  "https://raw.githubusercontent.com/google/fuzzing/master/dictionaries/url.dict" | ||
|
||
# The OSS-Fuzz base image has outdated dependencies by default so we upgrade them below. | ||
python3 -m pip install --upgrade pip | ||
# Upgrade to the latest versions known to work at the time the below changes were introduced: | ||
python3 -m pip install 'setuptools~=69.0' 'pyinstaller~=6.0' |