libass: add seed corpus #12687

tyler92 · 2024-11-04T19:37:12Z

I noticed that ass_parse_tags is not covered good enough. This PR adds a seed corpus from the libass-tests repository which improves coverage

github-actions · 2024-11-04T19:38:24Z

tyler92 is a new contributor to projects/libass. The PR must be approved by known contributors before it can be merged. The past contributors are: DonggeLiu, rcombs, astiob, devtty1er, Dor1s, inferno-chromium, ssbr (unverified), mikea (unverified)

tyler92 · 2024-11-06T22:19:27Z

@astiob What do you think about the change?

astiob · 2024-11-08T01:39:58Z

I’ll be honest: I don’t know fuzzing well enough to comment on this, so I asked around, and my impression is that people are concerned about (1) whether this doesn’t worsen the fuzzing workload due to repetitiveness and human-friendliness of the libass-tests samples and (2) why these tags don’t seem to be generated automatically, seeing as they’re present in the dictionary.

If you’re certain that this improves coverage and reasonably sure that it doesn’t have other negative effects, then I figure this is probably good? even if it raises additional questions.

TheOneric · 2024-11-08T22:11:36Z

Additionally even the samples in libass-tests unfortunately don't yet cover all tags — but the dictionary contains all tags (for more than two years already), so as mentioned above the poor coverage is surprising and somewhat worrying. Thanks for bringing attention to this!

I’m not sure why it didn’t yet manage to generate any samples for so many tags. In libass/libass#568 the CIFuzz action always early-exited due to timeouts making it useless, though at the time regular OSS Fuzz still seemed to work fine (producing new reports).
Rechecking this now, timeouts still occur within the first few minutes of fuzzing for address and undefined exits even early due to signed-integer overflows which used to be and should be ignored for libass (at least atm). Does one of those issues now also bog down OSS Fuzz or is something else going on?

tyler92 · 2024-11-09T20:24:09Z

You're right - the primary root cause of the low coverage is the fuzzer's inefficiency. I hope that, once it improves, those tags will be covered even without an additional seed corpus.

Regarding the seed corpus, it’s fine to include human-friendly samples, as the fuzzer can minimize the corpus by finding a smaller set that achieves the same coverage. One advantage of having such a seed corpus is that you can add samples that triggered crashes but weren’t discovered by the fuzzer for some reason.

Given that, I’m undecided on whether to close this PR

TheOneric · 2024-11-11T22:13:09Z

Regarding the seed corpus, it’s fine to include human-friendly samples, as the fuzzer can minimize the corpus by finding a smaller set that achieves the same coverage.

If the fuzzer actually discards samples not improving coverages during load this might work out, though perhaps it might be better to be more selective about the included files, some rely on special fonts etc. Without having tested it, i’m guessing the following may be the most useful:

crash/generic.ass
regression/draw-images/*.ass
regression/karaoke/*.ass
regression/scaling_playres/*.ass
regression/ssa/*.ass
regression/v4++/*.ass

While e.g. font_nonunicode seems unhelpful for fuzzing and I’m unsure about regression/embedded_font/efont.ass.

If a more selective set works out or you’re sure the fuzzer won't waste any time on spurious samples other than a neglible bit during initial corpus loading, it probably makes sense to do this until we figured out what’s wrong with the main fuzzer.

libass: add seed corpus

3f9822e

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

libass: add seed corpus #12687

libass: add seed corpus #12687

tyler92 commented Nov 4, 2024

github-actions bot commented Nov 4, 2024

tyler92 commented Nov 6, 2024

astiob commented Nov 8, 2024 •

edited

Loading

TheOneric commented Nov 8, 2024

tyler92 commented Nov 9, 2024

TheOneric commented Nov 11, 2024

libass: add seed corpus #12687

Are you sure you want to change the base?

libass: add seed corpus #12687

Conversation

tyler92 commented Nov 4, 2024

github-actions bot commented Nov 4, 2024

tyler92 commented Nov 6, 2024

astiob commented Nov 8, 2024 • edited Loading

TheOneric commented Nov 8, 2024

tyler92 commented Nov 9, 2024

TheOneric commented Nov 11, 2024

astiob commented Nov 8, 2024 •

edited

Loading