Restarting Honggfuzz with a huge corpus #402

Open
mvanotti opened this issue Jun 3, 2021 · 2 comments

Comments

@mvanotti
Contributor

mvanotti commented Jun 3, 2021

I left honggfuzz running overnight and it found 5 different crashes, but my corpus directory ended up with 73k files (>500MiB).

Now that I have fixed all the crashes, I want to resume fuzzing, but honggfuzz has to run in dry-run mode until it has processed the entire corpus. Running the minimizer doesn't help because it is single-threaded: it would take around a day to process the 72k files on my computer, whereas the dry run is going to take ~2 hours with 6 threads.

One idea would be to fix issue #401, so that minimization can be performed in a reasonable time.

Other options could be running minimization during the fuzzing process so the corpus doesn't grow too big, or allowing the fuzzer to start fuzzing before the dry run finishes. After it has processed a couple thousand files, it surely has enough info to start fuzzing.

@robertswiecki
Collaborator

One solution would be to run it as

honggfuzz --input input_corpus --output output_corpus -- bin

When it switches from states 1/3 and 2/3 to 3/3, output_corpus will contain a somewhat minimized corpus, and all of that will have been done in parallel.

It's not as good as -M, because samples are not sorted for minimization, but it might work for your use case.
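
To check how much smaller the output corpus actually is after such a run, the two directories can be compared by file count and total size. The helper below is only a sketch: `corpus_stats` is a hypothetical name, not part of honggfuzz, and it assumes the corpus files sit directly inside each directory and that `find` supports `-maxdepth` (GNU and BSD versions do).

```shell
# Hypothetical helper, not part of honggfuzz.
# Prints "<file count> <total bytes>" for the regular files directly
# inside the given directory (subdirectories are ignored).
corpus_stats() {
    dir=$1
    # Count regular files; tr strips the padding some wc versions add.
    count=$(find "$dir" -maxdepth 1 -type f | wc -l | tr -d ' ')
    # Sum file sizes by streaming the contents through wc -c.
    bytes=$(find "$dir" -maxdepth 1 -type f -exec cat {} + | wc -c | tr -d ' ')
    echo "$count $bytes"
}

# Intended use, after running the command from the comment above:
#   honggfuzz --input input_corpus --output output_corpus -- bin
#   corpus_stats input_corpus
#   corpus_stats output_corpus
```

If the second line of output shows noticeably fewer files than the first, output_corpus can serve as the input directory for the next restart.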

@mvanotti
Contributor Author

mvanotti commented Jun 7, 2021

Hi @robertswiecki, thanks for your answer.

What are states 1/3, 2/3 and 3/3? Is this how honggfuzz is commonly used? I feel like something could be improved here: the corpus keeps growing, so every time I restart the fuzzer it now takes a couple of hours before it starts properly fuzzing.

At some point, maybe it makes sense to start fuzzing and mutating before the entire input corpus has been processed? And maybe to minimize it at the same time?
