Add plots for the original FACS 1.0 dataset #106

Open
brainstorm opened this issue Jan 20, 2014 · 23 comments

@brainstorm
Collaborator

We should get some metrics (runtime/accuracy) from the old dataset (the one used in the original paper) with the current version of FACS.

Preferably in an automated/reproducible way; manually if that is too cumbersome.

@tzcoolman, @henrikstranneheim, can you take care of that?

@tzcoolman
Contributor

@brainstorm I thought I had done that before. Also, FastQ Screen only accepts datasets in FASTQ format, while the old one is in FASTA. FACS 2.0 and DeconSeq have slightly different runtimes when handling FASTA and FASTQ files.

@brainstorm
Collaborator Author

Are those plots relevant today given all the changes introduced to FACS since then?

Can they be regenerated by anyone other than you and/or Henrik? Where is the code to do so?

Thanks Enze!

@tzcoolman
Contributor

It is reproducible, of course, though I didn't use a script to run it. All I have is the relevant data. (Same as what Henrik did in the paper, only counting E. coli and human chr 8 and 22.) @brainstorm

@tzcoolman
Contributor

@arvestad @guillermo-carrasco @brainstorm I rebuilt a synthetic dataset using simNGS, with exactly the same composition (species types) and similar proportions as the old one (Henrik's). It can easily be merged into the standard Python testing module (ecoli ref test). Should I send it to you directly, or should I run the test myself?

@guillermo-carrasco
Contributor

Hej!

If you generated it with simNGS, it should be easily reproducible, right? We just need the generation parameters to get the same output; no need to send lots of GB over the network!
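
One way to share "the generation parameters" instead of the data is a small manifest that stores the exact command line plus checksums of the inputs and outputs. The sketch below is only an illustration: `write_manifest`, `verify_outputs` and the manifest fields are hypothetical names, not part of FACS or simNGS, and byte-identical output additionally assumes the simulator is run with a fixed random seed.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(manifest_path, command, references, outputs):
    """Record how a dataset was generated and what its files hash to."""
    manifest = {
        "command": command,  # exact command line used (hypothetical field)
        "references": {str(p): sha256_of(p) for p in references},
        "outputs": {str(p): sha256_of(p) for p in outputs},
    }
    Path(manifest_path).write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return manifest


def verify_outputs(manifest_path):
    """Return the output files whose current hash differs from the manifest."""
    manifest = json.loads(Path(manifest_path).read_text())
    return [p for p, expected in manifest["outputs"].items()
            if sha256_of(p) != expected]
```

Anyone regenerating the dataset can then call `verify_outputs` on the committed manifest; an empty list means the regenerated files match the originals.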

@tzcoolman
Contributor

@guillermo-carrasco I don't quite understand. All I used was the default settings. Do you mean all the reference genomes that Henrik used before?

@brainstorm
Collaborator Author

@tzcoolman You should write a Python test that:

  1. Builds the old dataset using the "default setting" you mention.
  2. Runs FACS against it.
  3. Generates data points, or even plots, from the results.
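
The three steps above could be sketched roughly as follows. This is a skeleton under stated assumptions, not the project's actual test: the toy reads stand in for a simNGS-built dataset, the `classify` callable stands in for a FACS query, and `evaluate` and `OldDatasetTest` are hypothetical names.

```python
import time
import unittest


def evaluate(classify, reads):
    """Run `classify` on labelled reads and return runtime and accuracy.

    `reads` is a list of (sequence, is_contaminant) pairs; `classify` is any
    callable returning True when it flags a sequence as contamination.
    """
    start = time.perf_counter()
    calls = [classify(seq) for seq, _ in reads]
    runtime = time.perf_counter() - start
    correct = sum(call == truth for call, (_, truth) in zip(calls, reads))
    return {"runtime_s": runtime, "accuracy": correct / len(reads)}


class OldDatasetTest(unittest.TestCase):
    def setUp(self):
        # Step 1: build the dataset. Toy stand-in; a real test would
        # generate reads from the reference genomes here.
        self.reads = [("ACGTACGT", True), ("TTTTGGGG", False), ("ACGTTTTT", True)]

    def test_metrics(self):
        # Step 2: run the filter. Hypothetical stand-in for a FACS query.
        def classify(seq):
            return seq.startswith("ACGT")

        # Step 3: produce data points that a plotting script could consume.
        metrics = evaluate(classify, self.reads)
        self.assertEqual(metrics["accuracy"], 1.0)
        self.assertGreaterEqual(metrics["runtime_s"], 0.0)
```

Keeping the classifier as an injected callable means the same `evaluate` helper can time FACS and any other tool being compared, as long as each is wrapped in a function with that shape.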

@guillermo-carrasco
Contributor

What I meant is that you don't have to send the dataset itself, just a "how-to" for generating it. Automating the procedure as @brainstorm suggests would be the best solution, of course.

@tzcoolman
Contributor

@brainstorm
It cannot be fully automatic. For at least 20 of the reference genomes, I don't know where to download them.

@tzcoolman
Contributor

@brainstorm
It cannot be fully automatic. For at least 20 of the reference genomes, I don't know where to download them automatically.

@brainstorm
Collaborator Author

Alright, how big are those? Can you put them on your public dropbox account for now? I’ll figure out a better location for them.

@tzcoolman
Contributor

@brainstorm 64 MB without the human chromosome references. I guess for human hg19 chr 8 and chr21, it will be easy to download them automatically.
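
A rough sketch of what that automatic download might look like. The URL pattern is an assumption about the UCSC download server layout and should be checked before use; `chromosome_url` and `fetch_chromosome` are hypothetical helper names.

```python
import gzip
import shutil
import urllib.request

# Assumed layout of the UCSC download server; verify before relying on it.
UCSC_HG19 = "https://hgdownload.soe.ucsc.edu/goldenPath/hg19/chromosomes/{name}.fa.gz"


def chromosome_url(name):
    """Build the download URL for one hg19 chromosome, e.g. 'chr8'."""
    return UCSC_HG19.format(name=name)


def fetch_chromosome(name, dest):
    """Download one gzipped chromosome and write it uncompressed to `dest`."""
    with urllib.request.urlopen(chromosome_url(name)) as response:
        with gzip.GzipFile(fileobj=response) as unzipped, open(dest, "wb") as out:
            shutil.copyfileobj(unzipped, out)
    return dest
```

For example, `fetch_chromosome("chr8", "chr8.fa")` would stream the file to disk without holding the whole chromosome in memory.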

@tzcoolman
Contributor

@brainstorm I'll leave it on Lars' desktop 'turing'. I haven't been able to use Dropbox since June. It's about 100 MB unzipped.

4 participants