Let's add fuzz testing! #1849

fuzzah · 2023-07-11T07:46:54Z

Hello!

I use cowrie and would like to ensure and improve its security.
Not being familiar with Twisted I tried implementing fuzz testing using the atheris fuzzer, but failed miserably :(

Here's my attempt: https://gist.github.com/fuzzah/9a91fd0b4725779fbcaae13edbb22596
This code leaks memory, so fuzzing stops after few minutes having taken 2 gigs of RAM. I tried to use tracemalloc and gc modules to no avail. The memory allocations happen somewhere in Twisted.

I suppose the main problem is that I create objects manually and not creating something responsible for object deletion.

Describe the solution you'd like
Please help me implement fuzzing for cowrie :)
I suppose we'd need to test both protocol and all available commands. In the gist above both things should be tested in theory, but more direct approach for commands would be better.

Describe alternatives you've considered
I tried fuzzing cowrie with radamsa. It works, but results are just not fruity at all, as radamsa isn't coverage-based (doesn't understand what code paths are taken in tested app). This means it fails somewhere in the early stages of server-client interaction.

Additional context
Yes, I realize cowrie is a honeypot software "intended" to be hacked, but I don't want it to crash with unhandled exceptions. Also there are class pollution attacks in Python allowing RCE. Of course hackers are contained somewhere in docker, but we should still limit their actions, as docker also had container escape vulnerabilities in the past.
You get my point: we should still protect the software, so let's add some fuzzing!

fuzzah · 2023-07-11T08:07:30Z

Here's my fuzzing session which I have stopped after approximately 1 minute:

Note the memory growth (rss stats).
Below are the most memory-allocating source lines.

This line is on the top: site-packages/twisted/protocols/policies.py:662
It just has a definition of TimeoutMixin.setTimeout.

micheloosterhof · 2023-07-12T14:18:57Z

So the TimeoutMixin keeps hanging around because calling loseConnection() does not remove the timeout and the callLater() keeps hanging around. Removing the TimeoutMixin removes the issue, so it's definitely got to do with this.
It seems similar to twisted/twisted#2447

I'm building a PR to remediate this #1851

After applying this, the TimeoutMixin disappears, but other objects remain, so there's more going on.

micheloosterhof · 2023-07-15T10:38:06Z

I think you are running into this issue: https://stackoverflow.com/questions/65321138/twisted-unittest-reactor-unclean-when-using-timeoutmixin

micheloosterhof · 2023-07-16T03:12:49Z

The quick way to do your fuzzing (but not the greatest) is to remove the TimeoutMixin. This'll give you the same results, I don't expect the fuzzer to interact with the timeout code. You won't have issues then.
Or change your fuzzer code so the disconnect is handled correctly. The disconnect should remove the DelayedCall that's open for the timeout checker.

fuzzah · 2023-07-16T10:12:39Z

@micheloosterhof , thank you for your replies :)
I'll come back after trying again (in a day or two)

fuzzah · 2023-07-18T16:02:05Z

So I've tried to use StringTransportWithDisconnection instead of simple StringTransport.
The memory still leaks, but much slower now: got an hour of fuzzing!

What also worries me is the number of code paths atheris could discover: just 41 for an hour of testing. I find it way too little for SSH protocol and cmdlines of different emulated programs.
I've updated the gist to include new code, dictionary, and atheris cmdline.

micheloosterhof · 2023-10-19T13:43:04Z

Any results of your fuzzing or shall I close this issue?

fuzzah · 2023-10-21T23:14:56Z

@micheloosterhof, fuzzing should be added to the project's code base and run regularly, otherwise it doesn't make much sense in the long run. So no "results" yet.
As of my research:

Something is clearly wrong with my gist, as only 41 code paths were detected by atheris, which is far too few for such a big program, and the memory is still leaking slowly, so the fuzzer quits with OOM after ~1 hour of testing.
No bugs were found in 3 fuzzing runs of ~1 hour each. This doesn't mean that there are no bugs in cowrie. Good fuzzing should take a long time (see what they do in the OSS-Fuzz project).
I wasn't able to collect coverage with fuzzed test samples, so I can't even confirm whether these 41 code paths were found in cowrie's code base or not. I just don't know how to do it right with twisted running cowrie (what would be the cmdline of the coverage tool?).

So if you're interested in making the application more secure, it would be great if you could help with the fuzzing implementation instead of closing the issue :) I'm sure you as the developer would do it much better than me.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Let's add fuzz testing! #1849

Let's add fuzz testing! #1849

fuzzah commented Jul 11, 2023

fuzzah commented Jul 11, 2023

micheloosterhof commented Jul 12, 2023 •

edited

micheloosterhof commented Jul 15, 2023

micheloosterhof commented Jul 16, 2023 •

edited

fuzzah commented Jul 16, 2023

fuzzah commented Jul 18, 2023

micheloosterhof commented Oct 19, 2023

fuzzah commented Oct 21, 2023 •

edited

Let's add fuzz testing! #1849

Let's add fuzz testing! #1849

Comments

fuzzah commented Jul 11, 2023

fuzzah commented Jul 11, 2023

micheloosterhof commented Jul 12, 2023 • edited

micheloosterhof commented Jul 15, 2023

micheloosterhof commented Jul 16, 2023 • edited

fuzzah commented Jul 16, 2023

fuzzah commented Jul 18, 2023

micheloosterhof commented Oct 19, 2023

fuzzah commented Oct 21, 2023 • edited

micheloosterhof commented Jul 12, 2023 •

edited

micheloosterhof commented Jul 16, 2023 •

edited

fuzzah commented Oct 21, 2023 •

edited