Add plots for the original FACS 1.0 dataset #106

Open

brainstorm opened this issue Jan 20, 2014 · 23 comments

Collaborator

brainstorm commented Jan 20, 2014

We should get some metrics (runtime/accuracy) from the old dataset (the one used in the original paper) with the current version of FACS.

Preferably in an automated/reproducible way, or manually if that is too cumbersome.

@tzcoolman, @henrikstranneheim, can you take care of that?

ghost assigned henrikstranneheim

Contributor

tzcoolman commented Jan 21, 2014

@brainstorm I thought I had done that before. Plus, fastq screen only takes datasets in fastq format, while the old one is in fasta. FACS2.0 and deconseq also have slightly different runtimes when handling fasta and fastq files.

Collaborator Author

brainstorm commented Jan 21, 2014

Are those plots relevant today given all the changes introduced to FACS since then?

Can they be regenerated by anyone other than you and/or Henrik? Where is the code to do so?

Thanks Enze!

Contributor

tzcoolman commented Jan 21, 2014

It is reproducible of course, though I didn't use a script to run it. All I have is the relevant data. (Same as what Henrik did in the paper, only counting ecoli and human chr 8 and 22.) @brainstorm

Contributor

tzcoolman commented Jul 20, 2014

@arvestad @guillermo-carrasco @brainstorm I rebuilt a synthetic dataset with exactly the same composition (species types) and similar proportions as the old one (Henrik's) using simNGS. It can easily be merged into the std python testing module (ecoli ref test). Should I send it to you directly, or should I run the test myself?

Contributor

guillermo-carrasco commented Jul 22, 2014

Hi!

If you generated it with simNGS, it should be easily reproducible, right? We just need the generation parameters to get the same output; no need to send lots of GB over the network!

Contributor

tzcoolman commented Jul 22, 2014

@guillermo-carrasco I don't quite understand. All I used was the default settings. Do you mean all the ref genomes that Henrik used before?

Collaborator Author

brainstorm commented Jul 22, 2014

@tzcoolman You should write a python test that:

Builds the old dataset using the "default setting" you mention.
Runs FACS against it.
Generates data points, or even plots, out of it.
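The three steps above could be sketched as a small harness like the one below. This is only an illustration, not FACS code: `benchmark`, `toy_dataset` and `toy_classifier` are hypothetical names, the dataset builder stands in for the simNGS step, and the classifier stands in for a call to FACS. The real test would plug in the actual dataset generation and the actual FACS query.

```python
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical representation of a labelled read: (sequence, is_contaminant).
Read = Tuple[str, bool]


def benchmark(build_dataset: Callable[[], List[Read]],
              classify: Callable[[str], bool]) -> Dict[str, float]:
    """Build the dataset, classify every read, and report runtime and accuracy."""
    reads = build_dataset()
    if not reads:
        raise ValueError("dataset is empty")

    # Time only the classification step, not the dataset construction.
    start = time.perf_counter()
    predictions = [classify(seq) for seq, _ in reads]
    runtime = time.perf_counter() - start

    correct = sum(pred == truth for pred, (_, truth) in zip(predictions, reads))
    return {
        "reads": len(reads),
        "runtime_s": runtime,
        "accuracy": correct / len(reads),
    }


def toy_dataset() -> List[Read]:
    # Stand-in for "build the old dataset"; the real version would call simNGS.
    return [("ACGTACGT", True), ("TTTTTTTT", False),
            ("ACGTTTTT", True), ("GGGGCCCC", False)]


def toy_classifier(seq: str) -> bool:
    # Stand-in for "run FACS against it"; the real version would query FACS.
    return seq.startswith("ACGT")


def test_old_dataset_metrics() -> None:
    metrics = benchmark(toy_dataset, toy_classifier)
    assert metrics["reads"] == 4
    assert metrics["accuracy"] == 1.0
    assert metrics["runtime_s"] >= 0.0
```

The returned dictionary is the "data points" part; a plotting step could consume it later without changing the test.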

Contributor

guillermo-carrasco commented Jul 22, 2014

What I meant is that you don't have to send the dataset, just a "how-to" for generating it. Automating the procedure as @brainstorm suggests would be the best solution, of course.
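One lightweight form such a "how-to" could take is a small manifest that records what went into the dataset, plus a digest so that two people can check they are using the same recipe. The sketch below is an assumption about how that might look: the field names and the reference entries are placeholders, not the actual list of genomes or any real simNGS options.

```python
import hashlib
import json


def manifest_digest(manifest: dict) -> str:
    """Return a SHA-256 digest of the manifest that ignores key order."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Placeholder recipe: field names and values are illustrative only.
manifest = {
    "simulator": "simNGS",
    "settings": "default",
    "references": ["ecoli", "human chr 8"],
}

# Same content with keys in a different order gives the same digest.
reordered = {
    "references": ["ecoli", "human chr 8"],
    "settings": "default",
    "simulator": "simNGS",
}

digest = manifest_digest(manifest)
```

Committing a file like this next to the test would let anyone rebuild the dataset from the recipe instead of transferring it.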

Contributor

tzcoolman commented Jul 27, 2014

@brainstorm
It cannot be fully automated. For at least 20 of the ref genomes, I don't know where to download them.

Contributor

tzcoolman commented Jul 27, 2014

@brainstorm
It cannot be fully automated. For at least 20 of the ref genomes, I don't know where to download them automatically.

Collaborator Author

brainstorm commented Jul 27, 2014

Alright, how big are those? Can you put them on your public dropbox account for now? I’ll figure out a better location for them.

Contributor

tzcoolman commented Jul 29, 2014

@brainstorm 64M without the human chromosome refs. I guess for human HG19 chr 8 and chr21, it will be easy to download them automatically.

Contributor

tzcoolman commented Aug 3, 2014

@brainstorm I'll leave it on Lars' desktop 'turing'. I haven't been able to use Dropbox since June. It's about 100MB unzipped.
