
Understanding inconsistent coverage reports #11935

Open
renatahodovan opened this issue May 9, 2024 · 3 comments

@renatahodovan
Contributor

I'm trying to understand why the project I aim to improve yields quite inconsistent coverage results. Looking at the coverage reports, it seems that in the low-coverage cases one or two of the three possible targets are called only a few times (fewer than the number of seeds in the initial corpus; see example here: the corpus contains 21 elements, but LLVMFuzzerTestOneInput was executed only 12 times). I suspect that some flaky timeout or OOM occurs during corpus loading, causing the fuzz target to terminate prematurely.

Unfortunately, I cannot validate this locally with the helper script. Therefore, I'm interested in whether it's feasible to access the fuzzer logs associated with the public coverage reports, or at least one of the logs accompanying a low-coverage report. Alternatively, if someone could provide guidance on how to reproduce the CI setup using the infra/helper.py script (or run the CI itself locally), including timeout, RSS limit, max_total_time, environment variables, etc., that would be greatly appreciated.
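
For reference, is the CI coverage run roughly equivalent to the following helper.py invocation? This is only a sketch based on the OSS-Fuzz docs; I'm not sure how closely the flags and defaults match what the CI actually does:

# Sketch of a local coverage run with infra/helper.py; an approximation
# based on the documentation, not a verified copy of the CI configuration.
cd oss-fuzz

# Build the fuzz targets with coverage instrumentation.
python infra/helper.py build_fuzzers --sanitizer coverage quickjs

# Generate the coverage report (downloads the public corpus by default).
python infra/helper.py coverage quickjs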

@DavidKorczynski
Collaborator

I could imagine this is likely due to some statefulness in the target. See this issue, which looks into a similar problem: #9928

In short, the harness should ideally execute the same set of code regardless of the order in which the corpus is run against the target; however, I suspect that in this case some statefulness means the order impacts what gets executed. This is similar in spirit to your suspicion regarding timeouts or OOMs.
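
One way you could check this locally is to run the same seeds in two different orders against a coverage-instrumented build and compare the summaries. This is just a rough sketch: it assumes $OUT/$target is a coverage build of the harness, the seeds live in $CORPUS, and llvm-profdata/llvm-cov are available:

# Rough sketch: run the seeds in two different orders and compare coverage.
# Assumes a coverage-instrumented build of $OUT/$target and seeds in $CORPUS.
LLVM_PROFILE_FILE=order_a.profraw $OUT/$target $CORPUS/*
LLVM_PROFILE_FILE=order_b.profraw $OUT/$target $(ls -r $CORPUS/*)

llvm-profdata merge -sparse order_a.profraw -o order_a.profdata
llvm-profdata merge -sparse order_b.profraw -o order_b.profdata

# If the two totals differ, execution depends on corpus order (statefulness).
llvm-cov report $OUT/$target -instr-profile=order_a.profdata | tail -n 1
llvm-cov report $OUT/$target -instr-profile=order_b.profdata | tail -n 1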

This page may help re coverage logs: https://oss-fuzz-build-logs.storage.googleapis.com/index.html#quickjs

@renatahodovan
Contributor Author

@DavidKorczynski Thanks for your reply! There was indeed a statefulness issue in the targets, which I fixed just yesterday. I thought it was responsible only for the irreproducible issues and not for the coverage inconsistencies. However, if this is the case, then the coverage should stabilise as soon as the new code starts running in the coming days.

This page may help re coverage logs: https://oss-fuzz-build-logs.storage.googleapis.com/index.html#quickjs

I knew about this build log page, but it doesn't contain information about the execution parameters of the targets. Could you confirm that the coverage results are generated after 10 minutes of fuzzing with a 25-second timeout? (I saw similar constants somewhere, but I wasn't sure whether they are actually used to generate the corpus for the coverage measurement.)

@DavidKorczynski
Collaborator

Yeah, if there is statefulness then it will impact coverage collection; I'm quite sure that's the root cause of this issue. It will also have affected, e.g., corpus pruning, which means the corpus size may have jumped around a bit sporadically. Assuming the statefulness has been resolved, I think you should start seeing more stable/reliable patterns on the coverage graph.

Regarding coverage collection, this is the specific line used for running the actual coverage extraction:

timeout $TIMEOUT $OUT/$target $args &> $LOGS_DIR/$target.log

The timeout per target is set to 100 seconds:

local args="-merge=1 -timeout=100 $corpus_dummy $corpus_real"
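
Putting those two pieces together, a local approximation would look roughly like this (a sketch with placeholder paths and an assumed outer timeout; the real script also sets environment variables that aren't reproduced here):

# Rough local approximation of the coverage extraction step; the paths,
# the 1h outer timeout and LOGS_DIR are placeholders/assumptions.
target=your_fuzz_target
corpus_real=/path/to/downloaded/corpus   # the public corpus for the target
corpus_dummy=$(mktemp -d)                # empty output dir for the merge
LOGS_DIR=/tmp/coverage_logs; mkdir -p $LOGS_DIR
TIMEOUT=1h

args="-merge=1 -timeout=100 $corpus_dummy $corpus_real"
timeout $TIMEOUT $OUT/$target $args &> $LOGS_DIR/$target.log

So, as far as I can tell, no timed fuzzing session is involved here: the merge step executes each corpus element once, with a 100-second per-input timeout.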
