-
Hi @tnmquann, thanks! Unfortunately, I don't think integration with sourmash is possible. While the approaches are similar, the way the data is stored is very different. As for 31 vs 51, I'm a bit hesitant to use 51 because I haven't tested its implications for profiling. Future work on using 51 for more stringent profiling is possible, but I haven't seen compelling evidence for it yet.
-
Hi @bluenote-1577
First off, congratulations on the publication of sylph in Nature :) Your tool is helping me a lot in my current daily tasks.
I was wondering if there are any plans to extend sylph to reuse existing sourmash sketched databases, both the original and RocksDB versions (if possible)? sylph and sourmash seem to take similar approaches to sketching genomic data, so reusing existing sketched databases might help alleviate storage problems. For instance, the GTDB-R220 database (genomic representatives) lacks sufficient resolution for taxonomic profiling of my current dataset. My tests show that extending to the all-genomes version resolves this issue. However, I'm facing storage limitations when trying to build a custom database for sylph, and reusing the existing sourmash database could solve this problem.
Additionally, would it be possible to customize the k-mer size (e.g., using 51-mers instead of the default 31-mers)? In certain cases, I think a larger k might provide more stringent matches and help reduce the chance of false positives.
Thank you again for this excellent tool and looking forward to your thoughts!
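To make the 31 vs 51 question a bit more concrete, here is a minimal sketch of why a larger k is expected to be more stringent. It assumes a simple model in which substitutions hit each base independently at rate 1 - ANI, so a k-mer survives unchanged with probability ANI^k. This is a textbook approximation, not sylph's actual statistical model, and the function name is made up for illustration.

```python
def expected_shared_kmer_fraction(ani: float, k: int) -> float:
    """Expected fraction of k-mers left unchanged between two genomes.

    Assumption (not taken from sylph): each base mutates independently
    with probability 1 - ani, so a k-mer survives with probability ani**k.
    """
    if not 0.0 <= ani <= 1.0:
        raise ValueError("ani must be between 0 and 1")
    if k <= 0:
        raise ValueError("k must be a positive integer")
    return ani ** k


if __name__ == "__main__":
    # Compare the default k=31 with the proposed k=51 at several ANI values.
    for ani in (0.99, 0.97, 0.95, 0.90):
        f31 = expected_shared_kmer_fraction(ani, 31)
        f51 = expected_shared_kmer_fraction(ani, 51)
        print(f"ANI={ani:.2f}  k=31: {f31:.4f}  k=51: {f51:.4f}")
```

Under this model the shared fraction drops faster with divergence at k=51 than at k=31. That would make matches to distant genomes rarer, but it would also leave fewer shared k-mers for genuinely related genomes, so whether it helps profiling in practice would still need testing.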