Invalid indices in make_index_files.R assignment of non-HLA transcript sequences #7
Comments
Here's the warning (doesn't seem to cause a problem once I've substituted the command above):
|
I'm getting a similar error except it just exits the program:
|
Yeah, I think I was seeing execution die as well, with the "invalid indices" message. Have you had a chance to try my fix above? There's probably a better (older) tidyverse way to do it, but ...
I'm working in a conda virtual environment, using R 4.0.5. dplyr gets updated to 1.0.9 when installing hlaseqlib, and gives a warning message about "group_by" and a change as of dplyr 1.0.0, which may or may not be related to this issue (I'm not well versed in the tidyverse). When stepping through make_index_files.R to troubleshoot, it's the assignment of transcripts_no_hla that fails, with a message about invalid indices (sorry, I don't have the exact text in front of me). I solved it with classic R:
transcripts_no_hla <- transcripts[ which( names( transcripts ) %in% transcripts_db$transcript_id[ which( !( transcripts_db$gene_name %in% hladb_genes ) ) ] ) ]
... hopefully that gives you an idea of what the problem could be? I'm using a GENCODE v21 protein_coding transcripts FASTA and annotation for the main chromosomes. Before isolating the problematic step, I tried truncating the ENS[GT] IDs to get rid of the ".#" (version) suffix in both the annotation and the FASTA, and removing all non-protein_coding annotations from the GTF file, but neither changed the error. So I think the problem may be some changed dplyr syntax in the versions I'm using.
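For anyone who wants to check the base R workaround in isolation, here is a minimal sketch on toy data. The object names match the one-liner above, but the contents are made up for illustration: `transcripts` is assumed here to be a plain named character vector (in the real script it is presumably a sequence set read from the FASTA), and `transcripts_db` is assumed to be a data frame with `transcript_id` and `gene_name` columns.

```r
# Toy stand-ins for the real objects (hypothetical data, not from the issue)
transcripts <- c(tx1 = "ATGC", tx2 = "GGCA", tx3 = "TTAG")
transcripts_db <- data.frame(
  transcript_id = c("tx1", "tx2", "tx3"),
  gene_name = c("HLA-A", "TP53", "BRCA1"),
  stringsAsFactors = FALSE
)
hladb_genes <- c("HLA-A", "HLA-B")

# IDs of transcripts whose gene is NOT in the HLA gene list
non_hla_ids <- transcripts_db$transcript_id[
  !(transcripts_db$gene_name %in% hladb_genes)
]

# Keep only sequences whose names are among those IDs (logical indexing,
# so no positional indices can go out of range)
transcripts_no_hla <- transcripts[names(transcripts) %in% non_hla_ids]

print(names(transcripts_no_hla))
```

This drops the `which()` calls from the one-liner; logical indexing selects the same elements here, and splitting the lookup into two steps makes it easier to inspect `non_hla_ids` when debugging.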