WES files provided by ascat author #1526

Open
wlyucl opened this issue May 16, 2024 · 2 comments
wlyucl commented May 16, 2024

Description of the bug

Hi Developers,

I'm trying to run the ASCAT implementation in Sarek for CNV analysis on WES data. The nf-core Sarek documentation suggests following five steps, described at https://nf-co.re/sarek/3.4.0/docs/usage#how-to-generate-ascat-resources-for-exome-or-targeted-sequencing, to generate the reference resources (allele.zip, loci.zip, GC.zip, and RT.zip) for exome data instead of using the default iGenomes files directly. I noticed that the ASCAT authors also provide reference files for WES at https://github.com/VanLoo-lab/ascat/tree/master/ReferenceFiles/WES, which appear to be ready to use when combined with an appropriate BED file. Would it be feasible to use those in place of the default iGenomes references for ASCAT in Sarek?

I'm now running Sarek with the parameters --ascat_alleles, --ascat_loci, --ascat_loci_gc, and --ascat_loci_rt on the command line. The pipeline seems to work well, but it would be great to hear your advice.
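For readers who want to try the same setup, here is a minimal sketch of what such a command line might look like. The four `--ascat_*` parameters are the ones named above; everything else is an assumption for illustration: the directory and BED paths, the samplesheet and output names, the `docker` profile, the use of `--wes` and `--intervals`, and the resource file names (taken from the archive names mentioned in the question, not from the actual ASCAT download). The sketch only prints the command so it can be checked before anything is launched.

```shell
# Hypothetical locations; replace with wherever the downloaded WES resources live.
ASCAT_DIR="${ASCAT_DIR:-/path/to/ascat_wes_resources}"
TARGETS_BED="${TARGETS_BED:-/path/to/targets.bed}"

# Assemble the command as a string so it can be inspected before launching.
# File names below are placeholders, not the real names in the ASCAT repository.
sarek_cmd="nextflow run nf-core/sarek -r 3.4.0 -profile docker \
--input samplesheet.csv --outdir results \
--tools ascat --wes --intervals ${TARGETS_BED} \
--ascat_alleles ${ASCAT_DIR}/alleles.zip \
--ascat_loci ${ASCAT_DIR}/loci.zip \
--ascat_loci_gc ${ASCAT_DIR}/GC.zip \
--ascat_loci_rt ${ASCAT_DIR}/RT.zip"

# Dry run: print the command instead of executing it.
echo "$sarek_cmd"
```

Once the printed command looks right, it can be pasted into the terminal to start the run.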

Thank you!

Command used and terminal output

No response

Relevant files

No response

System information

No response

wlyucl added the bug label May 16, 2024
maxulysse added this to the 3.5 milestone May 16, 2024
@FriederikeHanssen
Contributor

Hi! By default we supply the WGS files, but you should be able to fetch the files you want and supply them easily via the command line. Thank you for flagging the updated files that are available; we can reflect this in our docs as well and link to them.

FriederikeHanssen added the docs label and removed the bug label Jul 8, 2024

lauren-tjoeka commented Jul 9, 2024

Looking forward to the update! I'm following this documentation, and in point 3 I think I've encountered a typo in the 'awk' command right after 'do':

cd battenberg_loci_on_target_hg38/
rm *chrstring*
rm 1kg.phase3.v5a_GRCh38nounref_loci_chr23.txt
for i in {1..22} X
do

awk '{ print $1 "\t" $2-1 "\t" $2 }' 1kg.phase3.v5a_GRCh38nounref_loci_chr${i}.txt > chr${i}.bed
#awk '{ print "chr" $1 "\t" $2-1 "\t" $2 }' 1kg.phase3.v5a_GRCh38nounref_loci_chr${i}.txt > chr${i}.bed

grep "^${i}_" GC_G1000_on_target_hg38.txt | awk '{ print "chr" $1 }' > chr${i}.txt
bedtools intersect -a chr${i}.bed -b targets_with_chr.bed | awk '{ print $1 "_" $3 }' > chr${i}_on_target.txt

n=$(wc -l chr${i}_on_target.txt | awk '{ print $1 }')

count=$((n * 3 / 10))
grep -xf chr${i}.txt chr${i}_on_target.txt > chr${i}.temp
shuf -n $count chr${i}_on_target.txt >> chr${i}.temp
sort -n -k2 -t '_' chr${i}.temp | uniq | awk 'BEGIN { FS="_" } ; { print $1 "\t" $2 }' > battenberg_loci_on_target_hg38_chr${i}.txt
done

zip battenberg_loci_on_target_hg38.zip battenberg_loci_on_target_hg38_chr*.txt

I could only get the for loop to run when I instead used the line I've commented out, which prepends "chr". Is this expected behaviour? I'm using hg19 references.
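One plausible explanation, offered as an assumption and not confirmed in this thread, is a chromosome-naming mismatch: if the loci file uses bare names ("1") while targets_with_chr.bed uses prefixed names ("chr1"), `bedtools intersect` finds no overlaps unless "chr" is prepended. The sketch below checks the first column of each file and picks the matching awk variant. The toy files, their coordinates, and the `has_chr_prefix` helper are made up for illustration.

```shell
# Toy inputs standing in for a loci file and a targets BED (made-up coordinates).
workdir=$(mktemp -d)
printf '1\t1000\n1\t2000\n' > "$workdir/loci_chr1.txt"
printf 'chr1\t900\t1500\n' > "$workdir/targets_with_chr.bed"

# Hypothetical helper: print "yes" if the first column of the first line starts with "chr".
has_chr_prefix() {
    awk 'NR == 1 { if ($1 ~ /^chr/) print "yes"; else print "no"; exit }' "$1"
}

loci_prefixed=$(has_chr_prefix "$workdir/loci_chr1.txt")
bed_prefixed=$(has_chr_prefix "$workdir/targets_with_chr.bed")

# Prepend "chr" only when the BED has the prefix and the loci file does not.
if [ "$bed_prefixed" = "yes" ] && [ "$loci_prefixed" = "no" ]; then
    awk '{ print "chr" $1 "\t" $2-1 "\t" $2 }' "$workdir/loci_chr1.txt" > "$workdir/chr1.bed"
else
    awk '{ print $1 "\t" $2-1 "\t" $2 }' "$workdir/loci_chr1.txt" > "$workdir/chr1.bed"
fi

cat "$workdir/chr1.bed"
```

The sketch does not handle the opposite case, where the loci file is prefixed and the BED is not; there the prefix would have to be stripped from one side or added to the other before intersecting.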

Many thanks!

FriederikeHanssen removed this from the 3.5 milestone Aug 19, 2024