Update RNA_calling.ipynb #478

Closed

gouwh29
Contributor

@gouwh29 gouwh29 commented Dec 9, 2022

Fix #475 in correct folder

@gaow gaow requested a review from hsun3163 December 9, 2022 16:57
@hsun3163
Collaborator

hsun3163 commented Dec 9, 2022

Hi @Gou-29, thanks for adding more to the pipeline. Is there a reason you want to keep the STARtmp files? According to the STAR author these should be removed, and the STAR pipeline from TensorQTL removes them as well.

"    rm -r ${_output[0]:nnnn}._STARtmp\n",

@gouwh29
Contributor Author

gouwh29 commented Dec 9, 2022

Hi @hsun3163: in my run on single-end data, there is no _STARtmp file, so I removed that line. I do not know whether PE data will produce one.

@hsun3163
Collaborator

hsun3163 commented Dec 9, 2022

> Hi @hsun3163: in my run on single-end data, there is no _STARtmp file, so I removed that line. I do not know whether PE data will produce one.

As I understand it, this line removes the cache directory in case STAR fails to do so itself. Could you push a new commit that keeps the line?

@gouwh29
Contributor Author

gouwh29 commented Dec 9, 2022

> > Hi @hsun3163: in my run on single-end data, there is no _STARtmp file, so I removed that line. I do not know whether PE data will produce one.
>
> As I understand it, this line removes the cache directory in case STAR fails to do so itself. Could you push a new commit that keeps the line?

Actually, in my run, the code returns an error if I do not remove that line.

@gouwh29
Contributor Author

gouwh29 commented Dec 9, 2022

@hsun3163: I just reproduced the error message:

in .log file:

rm: cannot remove 'xxx/20Pre._STARtmp': No such file or directory

in console:

ERROR: STAR_align_1 (id=65bd420a35f37e23) returns an error.
ERROR: [STAR_align_1]: [0]: Executing script in docker returns an error (exitcode=1, stdout=xxx/20Pre.Aligned.sortedByCoord.out.stdout).
The script has been saved to xxx.sh. To reproduce the error please run:
docker run --rm -xxxx
[STAR_output]: Exits with 4 pending steps (STAR_output, strand_detected_1, strand_detected_2, picard_qc)
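
The log above appears consistent with `rm -r` returning a nonzero exit status because its operand does not exist. A minimal sketch of a cleanup that tolerates a missing directory follows. The helper name `cleanup_star_tmp` is hypothetical and not part of the pipeline; the only behavior relied on is that `rm -f` ignores nonexistent operands.

```shell
# Hypothetical helper (not from the pipeline): remove a STAR temporary
# directory if it is still present, without failing when it is already gone.
# "$1" is the directory path, e.g. the ._STARtmp path shown in the log.
cleanup_star_tmp() {
    tmp_dir="$1"
    # -f makes rm ignore a nonexistent operand and keep a zero exit status,
    # so the step does not fail if STAR has already cleaned up after itself.
    rm -rf -- "$tmp_dir"
}
```

With this form the cleanup stays in place for runs where STAR leaves the directory behind, while runs where the directory is already gone no longer abort the step.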

@hsun3163
Collaborator

hsun3163 commented Dec 9, 2022

@Gou-29

Hmm, that is strange. Is the tmp file ever generated in your case?

@gouwh29
Contributor Author

gouwh29 commented Dec 9, 2022

> @Gou-29
>
> Hmm, that is strange. Is the tmp file ever generated in your case?

Yes, when the pipeline first starts running, there are three files ending with _STARgenome, _STARpass1, and _STARtmp. Since my previous runs excluded the line rm -r xxx_STARtmp, maybe STAR removed this file itself.

@hsun3163
Collaborator

@Gou-29 I have added several accommodations for SE data in my new PR, incorporating some of the changes you suggested. I will close this PR now. Can you pull once my new PR is merged and see if that works, particularly the adapter-trimming part?

@hsun3163 hsun3163 closed this Dec 10, 2022
@gouwh29
Contributor Author

gouwh29 commented Dec 13, 2022

@hsun3163: Thanks a lot for your updated pipeline! I have tested most parts of it, but there are still 3 issues:

  • In fastqc: I notice there is an option allowing the input fastq to be compressed or uncompressed. If my input is xxx.fq.gz, the output is xxx.fq_fastq.html, and the pipeline fails to capture it. This small bug does not stop the pipeline; it only throws an error.
  • In trimmomatic_trim_adaptor: since PE data and SE data need different java configurations, the first line of the actual code should be java -jar -Xmx${java_mem} ${trimmomatic_jar} ${"PE" if is_paired_end else "SE"} -threads ${numThreads} \
  • In the STAR part: I still need to exclude the line rm -r xxx_STARtmp to avoid the error message (which may stop the pipeline outright).
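
The Trimmomatic point in the list above can be sketched in plain shell. This is only an illustration under assumptions: the function name `trimmomatic_mode` and the `true`/`false` string convention for the paired-end flag are hypothetical and not taken from the pipeline, and the variable values are placeholders. Only the `PE` and `SE` mode words come from Trimmomatic's command line.

```shell
# Hypothetical helper: map a paired-end flag ("true"/"false") to the
# Trimmomatic mode word. PE and SE are Trimmomatic's two run modes.
trimmomatic_mode() {
    if [ "$1" = "true" ]; then
        echo "PE"
    else
        echo "SE"
    fi
}

# Print (rather than run) the start of the command line for single-end data.
# java_mem, trimmomatic_jar, and numThreads mirror the variable names in the
# comment above; the values here are placeholders for illustration only.
java_mem="4g"
trimmomatic_jar="trimmomatic.jar"
numThreads=2
echo "java -jar -Xmx${java_mem} ${trimmomatic_jar} $(trimmomatic_mode false) -threads ${numThreads}"
```

Choosing the mode word in one place keeps the rest of the command identical for both data types, apart from the input and output file arguments that each mode expects.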

Successfully merging this pull request may close these issues.

Bug fix in RNA calling pipeline