Options

Vinh Tran edited this page Mar 26, 2023 · 1 revision

Options for fDOG with a single seed sequence

Required arguments:
  --seqFile SEQFILE     Input file containing the seed sequence (protein only) in fasta format
  --jobName JOBNAME     Job name. This will also be the file name for the output
  --refspec REFSPEC     Reference taxon. It should be the species the seed sequence was derived from

Non-default directory options:
  --outpath OUTPATH     Output directory
  --hmmpath HMMPATH     Path for the core ortholog directory
  --corepath COREPATH   Path for the core taxa directory
  --searchpath SEARCHPATH
                        Path for the search taxa directory
  --annopath ANNOPATH   Path for the pre-calculated feature annotation directory
  --pathFile PATHFILE   Config file containing the paths to the data folders (in YAML format)

Core compilation options:
  --coreOnly            Compile only the core orthologs
  --reuseCore           Reuse existing core set of your sequence
  --minDist {species,genus,family,order,class,phylum,kingdom,superkingdom}
                        Minimum systematic distance of primer taxa for the core set compilation. Default: genus
  --maxDist {species,genus,family,order,class,phylum,kingdom,superkingdom}
                        Maximum systematic distance of primer taxa for the core set compilation. Default: kingdom
  --coreSize CORESIZE   Maximum number of orthologs in core set. Default: 6
  --coreTaxa CORETAXA   List of primer taxa that should exclusively be used for the core set compilation
  --CorecheckCoorthologsOff
                        Turn off checking for co-orthologs in the reverse search during the core compilation
  --coreRep             Obtain only the sequence being most similar to the corresponding sequence in the core set rather than all
                        putative co-orthologs
  --coreHitLimit COREHITLIMIT
                        Number of hits of the initial pHMM based search that should be evaluated via a reverse search. Default: 3
  --distDeviation DISTDEVIATION
                        The deviation in score in percent (0 = 0 percent, 1 = 100 percent) allowed for two taxa to be considered
                        similar. Default: 0.05
  --alnStrategy {local,glocal,global}
                        Specify the alignment strategy during core ortholog compilation. Default: local

Ortholog search strategy options:
  --searchTaxa SEARCHTAXA
                        Specify a file containing the list of search taxa
  --group GROUP         Limit the search to a certain systematic group
  --checkCoorthologsRefOff
                        Turn off checking for co-orthologs in the reverse search during the final ortholog search
  --rbh                 Requires a reciprocal best hit during the ortholog search to accept a new ortholog
  --rep                 Obtain only the sequence being most similar to the corresponding sequence in the core set rather than all
                        putative co-orthologs
  --lowComplexityFilter
                        Switch the low complexity filter for the blast search on. Default: False
  --evalBlast EVALBLAST
                        E-value cut-off for the Blast search. Default: 0.0001
  --evalHmmer EVALHMMER
                        E-value cut-off for the HMM search. Default: 0.0001
  --hitLimit HITLIMIT   Number of hits of the initial pHMM based search that should be evaluated via a reverse search. Default: 10
  --scoreCutoff SCORECUTOFF
                        Define the percent range of the HMM score of the best hit up to which a candidate of the hmmsearch will be
                        subjected for further evaluation. Default: 10

FAS options:
  --coreFilter {relaxed,strict}
                        Specify mode for filtering core orthologs by FAS score. In 'relaxed' mode candidates with insufficient FAS
                        score will be disadvantaged. In 'strict' mode candidates with insufficient FAS score will be deleted from the
                        candidates list. The option '--minScore' specifies the cut-off of the FAS score.
  --minScore MINSCORE   Specify the threshold for coreFilter. Default: 0.75

Other I/O options:
  --append              Append the output to existing output files
  --force               Overwrite existing ortholog search output files
  --forceCore           Overwrite existing core set of your sequence
  --noCleanup           Temporary output will NOT be deleted. Default: False
  --debug               Set this flag to obtain more detailed information about the ortholog search progress
  --debugCore           Set this flag to obtain more detailed information about the core compilation actions
  --silentOff           Show more output to terminal

Other options:
  --fasOff              Turn OFF FAS support
  --aligner {mafft-linsi,muscle}
                        Choose between mafft-linsi or muscle for the multiple sequence alignment. Default: muscle
  --cpus CPUS           Determine the number of threads to be run in parallel. Default: 4
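
As an illustration of how the options above combine, the following Python sketch assembles an argument list for a single-seed run. It is only a sketch under stated assumptions: the executable name `fdog.run` is not given on this page and is assumed here, the helper `build_fdog_command` is hypothetical, and the file name, job name and reference taxon are placeholders. Only the flags themselves come from the list above.

```python
import shlex


def build_fdog_command(seq_file, job_name, refspec, executable="fdog.run", **options):
    """Assemble an argument list for a single-seed run (hypothetical helper).

    Keyword options are mapped to long flags: True adds a bare switch,
    False or None skips the option, any other value is passed as a string.
    """
    # The three required arguments always come first.
    # NOTE: the default executable name is an assumption, not taken from this page.
    cmd = [executable, "--seqFile", seq_file, "--jobName", job_name, "--refspec", refspec]
    for name, value in options.items():
        if value is None or value is False:
            continue
        cmd.append(f"--{name}")
        if value is not True:
            cmd.append(str(value))
    return cmd


if __name__ == "__main__":
    # Placeholder inputs; replace them with your own seed file and reference taxon.
    cmd = build_fdog_command(
        "seed.fa", "test_job", "HUMAN@9606@3",
        minDist="genus", maxDist="kingdom", coreSize=6, cpus=4, rbh=True,
    )
    # Print the command for inspection instead of running it.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps it safe to pass to `subprocess.run` without shell quoting issues.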

Options for fDOG with multiple seed sequences are almost the same, except that --seqFolder (the input folder) is required instead of --seqFile. In addition, the option --keep retains the individual outputs; otherwise, only the merged outputs are returned.
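
The difference for multiple seed sequences can be sketched in the same way. Again, this is an assumption-laden illustration: the executable name `fdogs.run` is not stated on this page and is assumed, the helper `build_multi_seed_command` is hypothetical, and the folder, job name and reference taxon are placeholders.

```python
def build_multi_seed_command(seq_folder, job_name, refspec, keep=False,
                             executable="fdogs.run"):
    """Assemble an argument list for a multi-seed run (hypothetical helper)."""
    # --seqFolder takes the place of --seqFile; the other required arguments stay.
    # NOTE: the default executable name is an assumption, not taken from this page.
    cmd = [executable, "--seqFolder", seq_folder, "--jobName", job_name, "--refspec", refspec]
    if keep:
        # Keep the individual outputs in addition to the merged ones.
        cmd.append("--keep")
    return cmd


if __name__ == "__main__":
    # Placeholder inputs for illustration only.
    print(build_multi_seed_command("seeds/", "multi_job", "HUMAN@9606@3", keep=True))
```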