Q: Why does restarting/recompiling often generate corpus? #257

Open
klauspost opened this issue Aug 19, 2019 · 5 comments

Comments

@klauspost

When running fuzz tests, I very often come across cases where it will run for hours without generating additional cases:

2019/08/19 09:18:10 workers: 11, corpus: 551 (9h53m ago), crashers: 0, restarts: 1/9859, execs: 12274551 (344/sec), cover: 1378, uptime: 9h55m

However, when restarting, it will generate new cases, usually within minutes:

2019/08/19 09:24:35 workers: 11, corpus: 551 (4m36s ago), crashers: 0, restarts: 1/1016, execs: 11183 (41/sec), cover: 1378, uptime: 4m36s
2019/08/19 09:24:38 workers: 11, corpus: 552 (1s ago), crashers: 0, restarts: 1/1025, execs: 11280 (40/sec), cover: 1378, uptime: 4m39s

To me (no expert) this seems like a missed opportunity. Is it possible to get the same effect without having to manually restart the process?

@josharian
Collaborator

I wish I knew! I’ve observed the same thing and spent some time trying to understand why. I don’t yet. My best theory is that it has to do with the reminimization that occurs on startup, but I don’t have concrete evidence for or an explanation of that.

@josharian
Collaborator

Somewhat related...

I have some mutations implemented locally that are expensive but effective. I’ve been thinking for a while about monitoring coverage and, if progress stalls, switching into a “rattle-the-cage” mode where we do a bunch of expensive mutations to try to reach new territory. This is not an unfamiliar approach: I read a paper about folks using a SAT solver (way more expensive!) in such a way.
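
To make the idea concrete, here is a hypothetical sketch (not code from this thread or from go-fuzz) of stall detection driving a switch to expensive mutations; `cheapMutate` and `expensiveMutate` are placeholder names.

```go
// Hypothetical sketch of the "rattle-the-cage" idea: track how long it has
// been since coverage last grew and switch to expensive mutations once
// progress stalls. cheapMutate/expensiveMutate are placeholder names, not
// real go-fuzz functions.
package main

import (
	"math/rand"
	"time"
)

const stallThreshold = 30 * time.Minute // assumed stall window

type mutator struct {
	lastNewCover time.Time
	rng          *rand.Rand
}

// noteNewCoverage should be called whenever an input produces new coverage.
func (m *mutator) noteNewCoverage() { m.lastNewCover = time.Now() }

// mutate uses cheap mutations while coverage keeps growing and expensive
// ones once no new coverage has been seen for stallThreshold.
func (m *mutator) mutate(data []byte) []byte {
	if time.Since(m.lastNewCover) > stallThreshold {
		return expensiveMutate(data, m.rng) // e.g. splicing, dictionaries, long mutation chains
	}
	return cheapMutate(data, m.rng) // e.g. single bit flips
}

// Placeholder implementations so the sketch compiles.
func cheapMutate(data []byte, rng *rand.Rand) []byte {
	out := append([]byte(nil), data...)
	if len(out) > 0 {
		out[rng.Intn(len(out))] ^= byte(1) << uint(rng.Intn(8))
	}
	return out
}

func expensiveMutate(data []byte, rng *rand.Rand) []byte {
	out := data
	for i := 0; i < 16; i++ { // stand-in for a genuinely costly pipeline
		out = cheapMutate(out, rng)
	}
	return out
}

func main() {
	m := &mutator{lastNewCover: time.Now(), rng: rand.New(rand.NewSource(1))}
	m.noteNewCoverage()
	_ = m.mutate([]byte("seed"))
}
```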

I also tried adding tracing so that I could tell what mutations introduced new coverage. (This is trickier than it sounds; the code base is not set up for it.) The idea was then to treat the problem as a multi-armed bandit, re-weighting the mutations over time to favor the effective ones. Then you can throw in anything. I never got it fully working, but my initial experiments were not positive. Although I did learn that of the two dozen mutations in use, a handful are responsible for the vast majority of the new coverage.
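
For illustration, here is a hypothetical epsilon-greedy sketch of that bandit idea, with placeholder operator names rather than go-fuzz’s actual mutation list; the coverage feedback is faked so the example runs standalone.

```go
// Hypothetical epsilon-greedy sketch of the multi-armed-bandit idea: each
// mutation operator is an "arm", rewarded when its output triggers new
// coverage, and selection is re-weighted toward the effective ones.
package main

import (
	"fmt"
	"math/rand"
)

type arm struct {
	name    string
	pulls   int
	rewards int // times this mutation produced new coverage
}

type bandit struct {
	arms    []*arm
	epsilon float64
	rng     *rand.Rand
}

// pick explores a random arm with probability epsilon, otherwise exploits
// the arm with the best observed reward rate.
func (b *bandit) pick() *arm {
	if b.rng.Float64() < b.epsilon {
		return b.arms[b.rng.Intn(len(b.arms))]
	}
	best := b.arms[0]
	bestRate := -1.0
	for _, a := range b.arms {
		rate := 0.0
		if a.pulls > 0 {
			rate = float64(a.rewards) / float64(a.pulls)
		}
		if rate > bestRate {
			best, bestRate = a, rate
		}
	}
	return best
}

func (b *bandit) record(a *arm, newCoverage bool) {
	a.pulls++
	if newCoverage {
		a.rewards++
	}
}

func main() {
	b := &bandit{
		arms:    []*arm{{name: "bitflip"}, {name: "splice"}, {name: "insertDictWord"}},
		epsilon: 0.1,
		rng:     rand.New(rand.NewSource(1)),
	}
	for i := 0; i < 1000; i++ {
		a := b.pick()
		// A real fuzzer would execute the mutated input and check coverage;
		// here we fake "splice" being the most productive operator.
		b.record(a, a.name == "splice" && b.rng.Float64() < 0.05)
	}
	for _, a := range b.arms {
		fmt.Printf("%s: %d/%d\n", a.name, a.rewards, a.pulls)
	}
}
```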

I’m happy to share some more details about my more promising experiments if someone is interested in doing some work to get them integrated. Otherwise, I’ll eventually finish my slow process of cleaning up the worthwhile ones and upstreaming them incrementally.

@dvyukov
Owner

dvyukov commented Aug 19, 2019

Yes, it's frequently quite hard to understand what exactly happens inside a fuzzer and why. Minimization is a possible cause, which suggests that if we implement the "corpus rotation" idea in go-fuzz:
google/syzkaller#1348
it should fix this and improve the fuzzer beyond that. I've seen a number of indirect and almost direct signals suggesting that such corpus rotation should be very profitable.
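
As a rough illustration of what corpus rotation could look like (this is one reading of the linked idea, not the syzkaller implementation), here is a sketch where only a rotating window of corpus entries is eligible for mutation at any time, so old entries stop permanently dominating the schedule.

```go
// Hypothetical corpus-rotation sketch: only a rotating window of entries is
// eligible for mutation at any time; rotate() advances the window periodically.
package main

import "fmt"

type corpus struct {
	entries [][]byte
	window  int // how many entries are eligible per rotation period
	offset  int // start of the currently eligible window
}

// active returns the entries eligible for mutation right now.
func (c *corpus) active() [][]byte {
	n := len(c.entries)
	if n == 0 {
		return nil
	}
	out := make([][]byte, 0, c.window)
	for i := 0; i < c.window && i < n; i++ {
		out = append(out, c.entries[(c.offset+i)%n])
	}
	return out
}

// rotate advances the window; a fuzzer might call this every N execs.
func (c *corpus) rotate() {
	if len(c.entries) > 0 {
		c.offset = (c.offset + c.window) % len(c.entries)
	}
}

func main() {
	c := &corpus{
		entries: [][]byte{[]byte("a"), []byte("b"), []byte("c"), []byte("d"), []byte("e")},
		window:  2,
	}
	for round := 0; round < 3; round++ {
		fmt.Printf("round %d: %q\n", round, c.active())
		c.rotate()
	}
}
```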

@dvyukov
Owner

dvyukov commented Aug 19, 2019

Re the multi-armed bandit, I think we need to start with some framework that allows us to more explicitly distribute iterations (time?) across multiple strategies, with statically assigned priorities. The second (optional) step would be to learn/re-learn the probabilities.
But if you ask me, I am always ready to give some moderate amount of iterations to any new good strategy, because behavior over infinite time is more important in the end (especially in the context of continuous fuzzing).
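
A hypothetical sketch of such a framework (strategy names are placeholders, not go-fuzz code): strategies receive iterations in proportion to statically assigned priorities, and a learning layer could later adjust those weights.

```go
// Hypothetical sketch: distribute iterations across strategies according to
// statically assigned priorities. Re-learning the weights would be layered
// on top of this as an optional second step.
package main

import (
	"fmt"
	"math/rand"
)

type strategy struct {
	name     string
	priority int // static weight: share of iterations this strategy gets
	run      func()
}

// pick selects a strategy with probability proportional to its priority.
func pick(strategies []strategy, rng *rand.Rand) strategy {
	total := 0
	for _, s := range strategies {
		total += s.priority
	}
	r := rng.Intn(total)
	for _, s := range strategies {
		if r < s.priority {
			return s
		}
		r -= s.priority
	}
	return strategies[len(strategies)-1]
}

func main() {
	rng := rand.New(rand.NewSource(1))
	strategies := []strategy{
		{name: "cheapMutations", priority: 80, run: func() {}},
		{name: "expensiveMutations", priority: 15, run: func() {}},
		{name: "experimentalStrategy", priority: 5, run: func() {}},
	}
	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		s := pick(strategies, rng)
		s.run()
		counts[s.name]++
	}
	fmt.Println(counts)
}
```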

@dgryski
Contributor

dgryski commented Aug 20, 2019

If you read the AFL mailing list, there are lots of patches/papers that improve AFL's coverage fuzzing in certain cases. It might be interesting to see which of those we would want to import into go-fuzz.

klauspost added a commit to klauspost/go-fuzz that referenced this issue Oct 13, 2019
Add flag for automatic shutdown after a specified duration.

It seems there can be advantages to restarting fuzzing at regular intervals, as seen in dvyukov#257

Also it can be useful to run fuzz tests for a short period to check for regressions.

To facilitate this, add a `-duration` parameter. After the fuzzer has been running for this duration, it will do a regular exit.

This will always exit with code 0. It could make sense to exit with a non-zero code if there is one or more crashers.
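
For illustration only, here is a standalone sketch of how a duration-based shutdown could be wired; this is not the actual patch referenced by this commit.

```go
// Hypothetical sketch of a -duration flag: the fuzz loop stops cleanly once
// the deadline passes. fuzzOneInput is a placeholder for one fuzzing iteration.
package main

import (
	"flag"
	"fmt"
	"os"
	"time"
)

func main() {
	duration := flag.Duration("duration", 0, "exit after fuzzing for this long (0 = run forever)")
	flag.Parse()

	var deadline <-chan time.Time
	if *duration > 0 {
		deadline = time.After(*duration) // nil channel (never fires) when duration is 0
	}

	crashers := 0
	for {
		select {
		case <-deadline:
			fmt.Printf("duration elapsed, exiting; crashers: %d\n", crashers)
			// The commit message leaves open whether to exit non-zero when
			// crashers were found; that decision would go here.
			os.Exit(0)
		default:
			fuzzOneInput(&crashers)
		}
	}
}

func fuzzOneInput(crashers *int) {
	time.Sleep(10 * time.Millisecond) // stand-in for executing the fuzz function once
}
```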