Support for PureCN #1710

Draft: wants to merge 19 commits into base: dev

@lbeltrame lbeltrame commented Oct 29, 2024

This PR implements support for PureCN (https://github.com/lima1/).

Design rationale, as discussed at the hackathon:

  • Assume a PoN has been made already ("NormalDB", in PureCN-speak)
  • Don't calculate coverage with PureCN; use GATK4 instead, which supports CRAM input (denoising requires a PoN, however, which makes things a little more complicated)
  • Use an interval file already processed by PureCN (done in Sarek first)
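A minimal sketch of the coverage step outlined above. The PR text does not name the GATK4 tools, so `CollectReadCounts` and `DenoiseReadCounts` are assumed here. The helper name `gatk_coverage_commands`, the file names, and the output layout are hypothetical. The function only assembles the command lines and does not run them.

```python
from pathlib import Path


def gatk_coverage_commands(cram, reference, intervals, pon, outdir):
    """Build (not run) the two assumed GATK4 command lines for one sample.

    All arguments are paths; `pon` is a GATK read-count panel of normals,
    which is an assumption separate from PureCN's NormalDB.
    """
    outdir = Path(outdir)
    sample = Path(cram).stem  # "tumor.cram" -> "tumor"
    counts = outdir / f"{sample}.counts.hdf5"

    # Step 1: collect read counts over the pre-processed intervals.
    # A reference FASTA is passed because the input is CRAM.
    collect = [
        "gatk", "CollectReadCounts",
        "-I", str(cram),
        "-R", str(reference),
        "-L", str(intervals),
        "--interval-merging-rule", "OVERLAPPING_ONLY",
        "-O", str(counts),
    ]

    # Step 2: denoise the counts against the panel of normals.
    denoise = [
        "gatk", "DenoiseReadCounts",
        "-I", str(counts),
        "--count-panel-of-normals", str(pon),
        "--standardized-copy-ratios", str(outdir / f"{sample}.standardizedCR.tsv"),
        "--denoised-copy-ratios", str(outdir / f"{sample}.denoisedCR.tsv"),
    ]
    return collect, denoise


collect_cmd, denoise_cmd = gatk_coverage_commands(
    "tumor.cram", "genome.fasta", "purecn.interval_list", "pon.hdf5", "coverage"
)
print(" ".join(collect_cmd))
print(" ".join(denoise_cmd))
```

Returning argument lists rather than shell strings keeps the sketch usable with `subprocess.run` without quoting concerns. In the pipeline itself these would be Nextflow processes.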

CI is failing, but in areas I didn't touch (hopefully!)

TODO:

  • Tests
  • Docs
  • Actually testing this with real data

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/sarek branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@lbeltrame lbeltrame self-assigned this Oct 29, 2024

github-actions bot commented Oct 29, 2024

nf-core pipelines lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 82d477e

+| ✅ 215 tests passed       |+
#| ❔  11 tests were ignored |#
!| ❗   4 tests had warnings |!

❗ Test warnings:

  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
  • pipeline_todos - TODO string in base.config: Check the defaults for all processes

❔ Tests ignored:

✅ Tests passed:

Run details

  • nf-core/tools version 3.0.2
  • Run at 2024-10-30 10:08:43

lbeltrame commented Oct 29, 2024

@FriederikeHanssen @maxulysse

I haven't hooked up the VCF part yet: PureCN depends on MuTect2 output, and it also requires MuTect2 to have been run with specific parameters:

Make sure to run Mutect 2 with --genotype-germline-sites true --genotype-pon-sites true. You will not get usable output without those flags. Since Mutect 2 from GATK 4.2.0+, average base quality scores can be very low and variants will be too aggressively removed by PureCN. You will need to set --min-base-quality 20 in PureCN.R to keep them.

(https://www.bioconductor.org/packages/devel/bioc/vignettes/PureCN/inst/doc/Quick.html#3_Create_VCF_files)

What can we do in this case? As discussed in person, we assume that:

a. MuTect2 is being run;
b. MuTect2 was run with the right parameters (how do we warn if not?)
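
One possible approach to the open question of warning about missing flags, sketched under two assumptions: that the Mutect2 VCF still carries a `##GATKCommandLine=<ID=Mutect2,CommandLine="...">` header line, and that flags are recorded there as `--flag value`. The function names are hypothetical and nothing here is part of the PR.

```python
import re

# Flags the PureCN documentation quoted above requires on the Mutect2 call.
REQUIRED_FLAGS = {
    "--genotype-germline-sites": "true",
    "--genotype-pon-sites": "true",
}


def mutect2_command_lines(header_lines):
    """Return the CommandLine strings of all Mutect2 ##GATKCommandLine entries."""
    commands = []
    for line in header_lines:
        if line.startswith("##GATKCommandLine=") and "ID=Mutect2" in line:
            match = re.search(r'CommandLine="([^"]*)"', line)
            if match:
                commands.append(match.group(1))
    return commands


def missing_purecn_flags(header_lines):
    """Return required flags that no recorded Mutect2 command line sets to true.

    If the header has no Mutect2 entry at all, every required flag is reported.
    """
    commands = mutect2_command_lines(header_lines)
    missing = []
    for flag, value in REQUIRED_FLAGS.items():
        pattern = re.compile(re.escape(flag) + r"[ =]" + re.escape(value) + r"(\s|$)")
        if not any(pattern.search(cmd) for cmd in commands):
            missing.append(flag)
    return missing


# Made-up header for illustration only.
example_header = [
    "##fileformat=VCFv4.2",
    '##GATKCommandLine=<ID=Mutect2,CommandLine="Mutect2 '
    '--genotype-germline-sites true --input tumor.cram">',
]
print(missing_purecn_flags(example_header))  # prints ['--genotype-pon-sites']
```

A check like this can only warn. It cannot tell whether the header was rewritten or stripped by a later tool, so an empty result is not proof that the flags were used.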

lbeltrame commented Oct 29, 2024

Also, lint fails. What's the recommended way to patch those modules so that nf-core lint is happy? Should be fixed with patches.

This needs to be upstreamed pronto!
As discussed at the hackathon, we can reasonably assume PureCN is
run in the recommended way.
As long as Mutect has been run with a PoN, PureCN can work with it.

lbeltrame commented

On the implementation side, everything should be in place, at least conceptually. I can't test this here at the hackathon, so testing will have to wait until I'm back and can run it on some real-world data.

@lbeltrame lbeltrame changed the title from "Support for PureCN (do not merge, early draft)" to "Support for PureCN" Oct 30, 2024