With the prepared dataset, train the model:
python learn.py dataset/dataset_learn_v2_005.limma.csv dataset/dataset_learn_v2_005.label.csv --reduce_updown --save output/model_v2 --save_csv output/model_v2 --epoch_count 1000 --batch_size 0
--reduce_updown
Use this parameter if the dataset is separated into up and down columns.
--save
Directory to save model data.
--save_csv
Directory to save model data (in CSV form).
--epoch_count
Number of epochs to train.
--batch_size
Training batch size.
python test.py dataset/dataset_learn_v2_005.limma.csv dataset/dataset_learn_v2_005.label.csv --load output/model_v2 --save output/model_v2_test
python test.py dataset/dataset_test_005.limma.csv dataset/dataset_test_005.label.csv --load output/model_v2 --save output/model_v2_test
--load
Directory containing the learn.py output.
--save
Directory to save the test results.
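If you prefer to drive both steps from one script, the commands above can be assembled programmatically. The following is a minimal sketch, not part of this repository: build_commands is a hypothetical helper, and it only reuses the flags and paths already shown above.

```python
import shlex
import sys


def build_commands(train_prefix, test_prefix, model_dir, result_dir, epochs=1000):
    """Build argument lists for learn.py and test.py (hypothetical helper).

    Only flags that appear in the commands above are used.
    """
    learn = [
        sys.executable, "learn.py",
        f"{train_prefix}.limma.csv", f"{train_prefix}.label.csv",
        "--reduce_updown",
        "--save", model_dir,
        "--save_csv", model_dir,
        "--epoch_count", str(epochs),
        "--batch_size", "0",
    ]
    test = [
        sys.executable, "test.py",
        f"{test_prefix}.limma.csv", f"{test_prefix}.label.csv",
        "--load", model_dir,
        "--save", result_dir,
    ]
    return [learn, test]


commands = build_commands(
    "dataset/dataset_learn_v2_005",
    "dataset/dataset_test_005",
    "output/model_v2",
    "output/model_v2_test",
)

# Print the commands; to execute them, pass each list to
# subprocess.run(cmd, check=True) from the repository root.
for cmd in commands:
    print(shlex.join(cmd))
```

Building argument lists rather than one shell string avoids quoting problems if a path contains spaces.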
python GOenrichment.py $1/output/params_generator.csv --trait2genes GOBPname2gene.arabidopsis.txt --column_name "heat,salt,drought,cold" --count_cut 500 -o $1/output/GSEA_learn_top500.csv --descending --max_trait_cut 1
--trait2genes
Path to the trait-to-gene file.
--column_name
Names of the columns that contain p-values.
--count_cut
Number of gene sets to show.
-o
Path to the output file.
--descending
Sort the result data in descending p-value order.
--max_trait_cut
Remove genes with too few traits.
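The statistic computed by GOenrichment.py is not described here. As a rough illustration of what a top-N enrichment analysis typically does, the sketch below applies a hypergeometric over-representation test to the top count_cut genes of a ranked list. The function names and the toy data are hypothetical, and the real script may use a different method.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k): n genes drawn from N, of which K belong to the gene set."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def enrich(ranked_genes, trait2genes, count_cut):
    """Test each trait for over-representation among the top-ranked genes.

    Assumption: ranked_genes is already sorted so the most relevant genes
    come first; trait2genes maps a trait name to its member genes.
    """
    universe = set(ranked_genes)
    top = set(ranked_genes[:count_cut])
    results = []
    for trait, genes in trait2genes.items():
        members = set(genes) & universe
        if not members:
            continue  # trait has no genes in this dataset
        hits = len(top & members)
        p = hypergeom_sf(hits, len(universe), len(members), len(top))
        results.append((trait, hits, p))
    # smallest p-value (strongest enrichment) first
    return sorted(results, key=lambda r: r[2])


# Toy data, not taken from any real annotation file.
ranked = [f"g{i}" for i in range(1, 11)]
traits = {"response to heat": ["g1", "g2", "g3"], "other process": ["g8", "g9"]}
for trait, hits, p in enrich(ranked, traits, count_cut=3):
    print(f"{trait}\t{hits}\t{p:.4g}")
```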
Prepare your expression matrix (processed from CEL files) and metadata (either write the TSD header by hand or generate it from CSV), then use these commands:
# split the full dataset file into time-series-data (TSD) formatted files
python ../dataset.py tsd -f cold.csv
python ../dataset.py tsd -f heat.csv
# read directory just for test
python ../dataset.py read -d ./
# without any processing, just output raw p-value
python ../dataset.py compile -d ./ --tool limma -t dataset_test_005
# process p-values with threshold 0.05 (signed)
python ../dataset.py compile -d ./ --tool limma -t dataset_test_005 --pvalue 0.05
# process p-values with threshold 0.05, reindexing and separating up/down signals (unsigned)
python ../dataset.py compile -d ./ --tool limma -t dataset_test_005 --pvalue 0.05 --updown --reindex_df ../dataprocess/ttest_pval.csv
# generate the gene list order for comparison with other results (optional)
python dataset.py gen_genelist -f data/GSE3326_1.tsd -t dataset/index
# create dataset with pvalue 0.05
python dataset.py compile -d data_test/ --tool limma -t dataset/dataset_test_005 --pvalue 0.05 --reindex_file dataset/index.txt --updown --label_index "heat,salt,drought,cold"
# create dataset with additional filter
python dataset.py compile -d data/ --tool limma -t dataset/dataset_learn_v2_005 --pvalue 0.05 --reindex_file dataset/index.txt --updown --label_index "heat,salt,drought,cold" --filters "Species:Arabidopsis Thaliana,MinRepCnt:2"
The DEG information will then be stored as a matrix in dataset_test.csv.
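To make the difference between the "signed" and "up/down" outputs concrete, here is a minimal sketch of one plausible encoding. It is an assumption for illustration only: the exact values dataset.py writes, and whether the threshold comparison is strict, are not documented here, and the function names are hypothetical.

```python
def signed_calls(pvalues, directions, threshold=0.05):
    """Return +1 (up), -1 (down) or 0 (not significant) per gene.

    Assumption: a gene is significant when p < threshold, and the sign of
    `directions` (e.g. a log fold change) gives the direction of regulation.
    """
    calls = []
    for p, d in zip(pvalues, directions):
        if p < threshold and d > 0:
            calls.append(1)
        elif p < threshold and d < 0:
            calls.append(-1)
        else:
            calls.append(0)
    return calls


def split_updown(calls):
    """Split signed calls into two unsigned 0/1 columns: up and down."""
    up = [1 if c > 0 else 0 for c in calls]
    down = [1 if c < 0 else 0 for c in calls]
    return up, down


# Toy values for three genes.
calls = signed_calls([0.01, 0.20, 0.03], [1.5, -2.0, -0.7])
up, down = split_updown(calls)
print(calls, up, down)
```

With separate up and down columns, every value is non-negative, which is presumably why learn.py needs --reduce_updown to be told about this layout.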
You can use --filters to select TSData that match specific conditions, given as comma-separated column name and value pairs (for example, "Species:Arabidopsis Thaliana,MinRepCnt:2").
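As an illustration of how a filter string such as the one in the compile command above could be interpreted, the sketch below parses it into key/value pairs and checks a metadata record against them. This is not the code in dataset.py: the function names, the "RepCnt" metadata key, and the reading of MinRepCnt as a minimum replicate count are all assumptions.

```python
def parse_filters(spec):
    """Parse "Key:Value,Key:Value" into a dict of strings."""
    filters = {}
    for item in spec.split(","):
        key, _, value = item.partition(":")
        filters[key.strip()] = value.strip()
    return filters


def matches(metadata, filters):
    """Return True if a metadata record satisfies every filter.

    Assumption: "MinRepCnt" is a lower bound on a hypothetical "RepCnt"
    field; every other key is compared as a case-insensitive string.
    """
    for key, wanted in filters.items():
        if key == "MinRepCnt":
            if int(metadata.get("RepCnt", 0)) < int(wanted):
                return False
        elif str(metadata.get(key, "")).lower() != wanted.lower():
            return False
    return True


filters = parse_filters("Species:Arabidopsis Thaliana,MinRepCnt:2")
record = {"Species": "Arabidopsis thaliana", "RepCnt": 3}  # toy record
print(filters, matches(record, filters))
```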