Add trickest/cve as a data source for PoC #1396

Open
kotakanbe opened this issue Feb 17, 2022 · 5 comments

Comments

@kotakanbe
Member

https://github.com/trickest/cve

@jbmaillet

That looks interesting. Do you plan to do it in Vuls itself, or add it to CVE data in go-cve-dictionary? (Currently, I use only go-cve-dictionary from the Vuls.io stack: I work on IoT / embedded, not regular IT, so most parts of Vuls are not actionable in my field.)

On the other hand, I may add it to my client application, as I did for:

  • the CISA catalog,
  • InTheWild.io.

But while these catalogs are very small for now, less than a megabyte, trickest/cve is already about 170MB, so that would be trickiest (no pun intended) for me to grab on the fly while scanning my products.

Note: I also use the CVE reference tags from go-cve-dictionary, for "Exploit" (and "Mitigation"), so I would expect quite a few duplicates with trickest. I have no idea how many in my use cases; it could be an interesting experiment.

@kotakanbe
Member Author

kotakanbe commented Feb 17, 2022

Hi, @jbmaillet

Thanks for the info, I didn't know it was 170MB.
I'm wondering how to integrate it because it's so huge.
I'll think about it later, but it should go into one of the existing go-*-dictionary tools or a new one, not into Vuls itself.

From the list you gave me, @MaineK00n has already implemented CISA's catalog at https://github.com/vulsio/go-kev.

I didn't know about InTheWild.io, but it seems to be useful information.
Is this information reliable?

In Vuls, the PoC sources are ranked as follows, in order of reliability:

  1. Metasploit (https://github.com/vulsio/msfdb-list)
  2. ExploitDB and the reference URIs with the NVD Exploit tag (https://github.com/vulsio/go-exploitdb)
  3. qazbnm456/awesome-cve-poc and nomi-sec/PoC-in-GitHub

How trustworthy do you consider the information from InTheWild.io and from trickest/cve?
Where would you place each of them: at level 1, 2, or 3?

@jbmaillet

jbmaillet commented Feb 17, 2022

Thanks for the info, I didn't know it was 170MB.
I'm wondering how to integrate it because it's so huge.

As I see it for now:

  • it must be a server of its own, like other Vuls bricks,
  • or its information must be integrated into an already existing server, such as go-cve-dictionary (but go-cve-dictionary has constraints and priorities of its own: release candidate 6 of the CVE JSON schema was published yesterday, and this is more important than adding new bits in a rush).
  • But it cannot be something you grab from scratch at each new scan operation, as the ugly and deprecated OWASP dependency track and others did with the NVD in the past.

Furthermore:

What is a bit of a pity is that we can only get the data themselves with their full git history, and not the precise algorithm/heuristics they use to inventory the PoCs. I have a similar problem with Linux Kernel CVEs, also here on GitHub: we have the data, but not the code, and it has now been 2 years or so since they said they would release the code, but they haven't.

Note that I do not blame anyone for such situation: I for myself make a living with proprietary code, and have my own bits of secret sauce in the tools I develop. :-/

I didn't know about InTheWild.io, but it seems to be useful information.
Is this information reliable?

I don't know for sure, but this is an official initiative from Google / Microsoft / Apple (+ crowdsourcing), so for now I am inclined to trust it: these people have a lot of resources and are smart (at least smarter than me). Also, note that each record has one or more "trust ratings". For example, as I write this, CVE-2022-22620 has a single report with "high" confidence, while CVE-2021-45461 has 2, and others have 3. I could not find examples of low/mid confidence. For now, this catalog does not contain a single CPE I use, nor a CVE I detect, which is not a surprise: IoT is a bit apart from the IT crowd. But my development cost was about 3-4 hours, so it's fine.

(Side note: it is striking that the CISA catalog and InTheWild.io, a public sector initiative and a private sector one, started at the same time last November. There is truly a big momentum from all parts on cybersec, which is great.)

I have already implemented CISA's catalog at https://github.com/vulsio/go-kev.

Yes, I discussed that with your colleague @MaineK00n at the time. The catalog is so small and simple, that it was not a problem to make it built-in in my scanner rather than deploy a go-kev instance.

reliability of Metasploit vs ExploitDB vs qazbnm456/awesome-cve-poc vs nomi-sec/PoC-in-GitHub etc

I have no idea. In my experience (IoT, not IT), these are not really useful, because they do not match the CPEs/packages in our ecosystem. And anyway, even with a PoC, we would not have the resources to put it to use, that is to say to build an automated test suite around it, like Metasploit / Kali etc. do. Which leads me to my conclusion:

As I see it, this is all about prioritizing CVE fixing:

The higher the CVSS 3.1 score, the higher the number of references to PoCs or exploits, and the higher the number of clues that it is exploited in the wild, from the higher number of sources, the higher the priority.
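To make this concrete, here is a minimal sketch of such a ranking in Python. Everything in it is an assumption for illustration: the `CveSignals` record, its field names, and above all the choice to compare exploitation evidence first, then PoC count, then CVSS. Any ordering that grows with all three signals would fit the rule above equally well.

```python
from dataclasses import dataclass, field


@dataclass
class CveSignals:
    """Signals collected for one CVE (all field names are hypothetical)."""
    cve_id: str
    cvss3: float                                    # CVSS 3.1 base score, 0.0 to 10.0
    poc_refs: set = field(default_factory=set)      # distinct PoC/exploit URLs
    wild_sources: set = field(default_factory=set)  # catalogs reporting exploitation


def priority_key(s: CveSignals) -> tuple:
    # Tuples compare element by element: exploitation evidence first,
    # then PoC count, then CVSS. This ordering is an arbitrary choice.
    return (len(s.wild_sources), len(s.poc_refs), s.cvss3)


def prioritize(signals):
    """Return CVEs sorted from highest to lowest fixing priority."""
    return sorted(signals, key=priority_key, reverse=True)
```

With this key, a CVE listed in one "exploited in the wild" catalog outranks a CVE with a higher CVSS score but no such evidence. A weighted sum would be another reasonable design if the signals should trade off against each other.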

In this regard, both the CISA catalog and InTheWild.io were super cheap to implement, even if not fruitful for now, so no question: I did it.

But trickest/cve, or nluedtke/linux_kernel_cves are different in this regard:

  • they are not cheap, at least because of their size,
  • and we only have the data, not the code, so it's not so easy to do our own re-implementation, in GoLang for you, in Python for me.

They both are "open source data", but not "open source code".

PS: I want to take this opportunity to thank you and @MaineK00n for the priceless work you do. You have been the #1 OSS project I depend on for 3 years now. I wish I could convince my company to sponsor you rather than buying 100% bullsh*t useless tools from startups with marketing departments bigger than their engineering.

@jbmaillet

jbmaillet commented Feb 17, 2022

Last for today: what I think I will do is a quick and dirty prototype (max 1 day of development + 1 day of runs on my products' use cases) to get some actual figures on:

How many / what percentage of PoC do I get from trickest/cve, that I do not already have with the NVD references?

This is what matters in the end. At least for implementation priority.
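A rough sketch of how that figure could be computed in Python, assuming both sources have already been loaded into dictionaries mapping a CVE ID to its PoC URLs. The function names and the URL normalization are my own assumptions; the normalization is deliberately crude (it ignores scheme, query string, and host case), so the result would be an estimate, not an exact count.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Crude URL normalization so trivially different spellings compare equal."""
    parts = urlsplit(url.strip())
    return f"{parts.netloc.lower()}{parts.path.rstrip('/')}"


def new_poc_ratio(nvd_refs: dict, trickest_refs: dict):
    """Return (new_count, total_count, percentage) over all CVEs in trickest_refs.

    Both arguments map a CVE ID to an iterable of PoC URLs.
    """
    new = total = 0
    for cve_id, urls in trickest_refs.items():
        known = {normalize(u) for u in nvd_refs.get(cve_id, ())}
        candidates = {normalize(u) for u in urls}
        total += len(candidates)
        new += len(candidates - known)  # URLs the NVD references do not have
    pct = 100.0 * new / total if total else 0.0
    return new, total, pct
```

Restricting `trickest_refs` to the CVEs actually detected on a given product would give the per-use-case figure rather than a global one.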

@jbmaillet

Another potential hard point I thought of about trickest/cve: the data are in Markdown. While this is fine for human writing and reading, it has dialects, and does not have a well-defined structure. So, when injecting it into a real database, there may be some glitches, and validation using another format might be needed:

  • XML? No thanks.
  • JSON? Why not.
  • YAML would be fine to me.

Perhaps this point should be addressed, or at least reported, directly upstream as a trickest/cve issue or discussion?
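As an illustration of the kind of conversion and validation I mean, here is a small Python sketch that turns one Markdown document into a JSON record. It assumes, without having checked the upstream layout, that PoC links appear as plain bullet lines such as `- https://...`; the record keys `cve` and `poc_urls` are hypothetical.

```python
import json
import re

# Matches Markdown bullets such as "- https://example.com/poc"; other list
# markers or link syntaxes ("[text](url)") would need extra patterns.
BULLET_URL = re.compile(r"^\s*[-*]\s+(https?://\S+)\s*$")
CVE_ID = re.compile(r"^CVE-\d{4}-\d{4,}$")


def markdown_to_record(cve_id: str, markdown: str) -> dict:
    """Extract bullet URLs from one Markdown document into a plain dict."""
    if not CVE_ID.match(cve_id):
        raise ValueError(f"not a CVE ID: {cve_id!r}")
    urls = []
    for line in markdown.splitlines():
        m = BULLET_URL.match(line)
        if m and m.group(1) not in urls:  # keep order, drop duplicates
            urls.append(m.group(1))
    return {"cve": cve_id, "poc_urls": urls}


def to_json(record: dict) -> str:
    """Serialize a record deterministically, ready for schema validation."""
    return json.dumps(record, indent=2, sort_keys=True)
```

Anything the pattern does not match is silently skipped here; a real importer would probably log those lines, since they are exactly the "glitches" a dialect difference would produce.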

Another point: a kind of small META file, or META files for each year, similar to the NVD feeds' META, with a timestamp of the last update, could be useful too, to avoid downloading the full 170MB if no changes were made.
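A sketch of the client side of that idea in Python, assuming a hypothetical META file for trickest/cve that copies the `key:value` line format of the NVD feed META files (`lastModifiedDate`, `size`, `sha256`, and so on). No such file exists upstream as far as I know, so this only shows what the check would look like.

```python
def parse_meta(text: str) -> dict:
    """Parse 'key:value' lines, as used by the NVD feed .meta files."""
    meta = {}
    for line in text.splitlines():
        # Split on the first colon only: timestamps contain colons too.
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def needs_download(remote_meta: str, cached_meta) -> bool:
    """True when there is no cached copy or the remote timestamp/hash differs."""
    if cached_meta is None:
        return True
    remote, cached = parse_meta(remote_meta), parse_meta(cached_meta)
    keys = ("lastModifiedDate", "sha256")
    return any(remote.get(k) != cached.get(k) for k in keys)
```

The scanner would fetch only the small META file at each run, compare it with the copy saved at the last download, and pull the full data set only when `needs_download` returns true.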
