When the input sequencing files are in FASTQ format, using the YAML file is tricky. For example, paired FASTQ file names have to end with _1.fq.gz and _2.fq.gz, but this convention is not documented anywhere.
Our code does not validate the file properly either. A few cases:
If the SM tag is missing from the YAML file, cgpmap runs without complaint, but it ignores the RG ID given in the YAML file and generates a random one.
Identical FASTQ file names in the YAML file do not trigger any error or warning.
When an input file is missing from the YAML file, cgpmap does not complain.
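To make the three cases above concrete, here is a rough sketch of the kind of up-front check I have in mind. It is not cgpmap code: `validate_inputs`, its arguments, and the message wording are all hypothetical, and it assumes the YAML has already been parsed into a dict and that the file names referenced in it have been collected into a list.

```python
from collections import Counter


def validate_inputs(config, yaml_files, input_files):
    """Collect problems up front instead of silently carrying on.

    config      -- parsed YAML as a dict (hypothetical shape)
    yaml_files  -- FASTQ names referenced in the YAML file
    input_files -- FASTQ names actually passed to the tool
    """
    problems = []

    # Case 1: the SM tag must be present and non-empty.
    if not config.get("SM"):
        problems.append("SM tag is missing from the YAML file")

    # Case 2: the same FASTQ name appears more than once in the YAML.
    for name, count in sorted(Counter(yaml_files).items()):
        if count > 1:
            problems.append(f"{name} is listed {count} times in the YAML file")

    # Case 3: an input file has no entry in the YAML at all.
    described = set(yaml_files)
    for name in sorted(set(input_files)):
        if name not in described:
            problems.append(f"{name} is not described in the YAML file")

    return problems
```

Returning a list rather than raising on the first problem means the user sees everything that is wrong with the YAML in one run.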
I think it would be better to have a flag specifically for single-ended FASTQs. Cgpmap would assume inputs are paired-end and complain if they are neither interleaved nor paired; if an input really is single-ended, the user would need to label it explicitly. Currently it just carries on silently with its own assumptions.
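As a sketch of that behaviour (hypothetical names throughout, not how cgpmap currently works): pair files by the _1.fq.gz / _2.fq.gz suffix, accept single-ended or interleaved files only when the user has labelled them, and fail loudly on anything else. Interleaving is assumed here to be declared by the user rather than detected from the file contents.

```python
import re

# Paired files are expected to end in _1.fq.gz / _2.fq.gz.
PAIR_RE = re.compile(r"^(?P<stem>.+)_(?P<mate>[12])\.fq\.gz$")


def group_fastqs(names, single_end=(), interleaved=()):
    """Group FASTQ names; raise instead of guessing about unlabelled files."""
    groups = []
    mates = {}
    for name in names:
        if name in single_end:
            groups.append(("single", name))
            continue
        if name in interleaved:
            groups.append(("interleaved", name))
            continue
        match = PAIR_RE.match(name)
        if match is None:
            raise ValueError(
                f"{name}: neither paired nor interleaved; "
                "label it as single-ended explicitly"
            )
        mates.setdefault(match["stem"], {})[match["mate"]] = name

    # Every stem must have both mates before it counts as a pair.
    for stem, found in sorted(mates.items()):
        if set(found) != {"1", "2"}:
            raise ValueError(f"{stem}: only one file of the pair was given")
        groups.append(("paired", found["1"], found["2"]))
    return groups
```

With this, `group_fastqs(["s_1.fq.gz", "s_2.fq.gz"])` yields one paired group, while a lone `x.fq.gz` is rejected unless it is passed in `single_end`.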
For example, paired FASTQ file names have to end with _1.fq.gz and _2.fq.gz
I think this could be handled better by modifying the YAML format slightly. The YAML support was added very late; previously you had no ability to include any header information. Pushing the files into the read group records would allow explicit pairing, but work on the underlying calls to the aligner will be needed.
SM: sample
# the actual readgroups
READGRPS:
  ID: 9
  files:
    - fq_1_00001.fq.gz
    - fq_2_00001.fq.gz
  CN: centre
  DS: Please don't use multiline
  LB: Library_id
  PI: 500
  PL: FORCED TO UPPER
  PM: HiSeq-XTen
  PU: 1234_1

  ID: 10
  files:
    - fq_1.fq.gz
    - fq_2.fq.gz