A dsc-setup command #208

Open
gaow opened this issue Nov 26, 2019 · 4 comments

@gaow
Member

gaow commented Nov 26, 2019

To continue our in-person discussion about a dsc-setup command: it should provide a one-line command that sets up a GitHub-friendly template for DSC. The template should include the basic suggested script structure and hierarchy for DSC benchmarks, and optionally templates for querying and exploring results (using dscrutils::dscquery, potentially with a workflowr structure or dscrutils::shiny_plot in mind).

Implementation-wise, I suggest a command-line tool that people run by typing dsc-setup in a terminal, but written in R. This makes it easier for the lab to maintain and change, and we can potentially borrow code from workflowr for initializing a project.

For starters, this ticket discusses what we want to achieve. My current DSC organization is:

- scripts
   - module1.R
   - module2.R
   - whatever-lumped-scripts.R
- modules
   - module1.dsc
   - module2.dsc
   - whatever-lumped-modules.dsc
benchmark1.dsc
benchmark2.dsc
...

where benchmark*.dsc only has the DSC section.

We can invoke it as dsc-setup name, which will:

  1. create a GitHub repo called name
  2. prepare .gitignore and .gitattributes files for it
  3. set up the structure above, with a README.md explaining what each folder does
  4. set up a master DSC file, name.dsc, containing only the DSC section, with contents:
#!/usr/bin/env dsc
%include modules/*.dsc

DSC:
   output: "name"
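
Steps 2 to 4 above could be sketched roughly as follows. This is only an illustration, written in Python for brevity even though the proposal is to implement the tool in R. The function names (master_dsc_text, scaffold) and the README wording are hypothetical, the sketch assumes the project folder is named after the benchmark, and it uses the %include directive named later in this thread. Creating the GitHub repo and the .gitattributes file are left out.

```python
from pathlib import Path
import tempfile


def master_dsc_text(name: str) -> str:
    """Contents of the master DSC file, following the proposal above."""
    return (
        "#!/usr/bin/env dsc\n"
        "%include modules/*.dsc\n"  # assumption: directive named later in the thread
        "\n"
        "DSC:\n"
        f'   output: "{name}"\n'
    )


def scaffold(parent: Path, name: str) -> Path:
    """Create <parent>/<name> with scripts/, modules/, README.md and <name>.dsc."""
    root = Path(parent) / name
    for folder in ("scripts", "modules"):
        (root / folder).mkdir(parents=True, exist_ok=True)
    # Placeholder README text; the real wording is still to be decided.
    readme = (
        f"# {name}\n\n"
        "- scripts/: scripts used by the modules\n"
        "- modules/: DSC module definitions (*.dsc)\n"
        f"- {name}.dsc: master file with the DSC section only\n"
    )
    (root / "README.md").write_text(readme)
    (root / f"{name}.dsc").write_text(master_dsc_text(name))
    return root


if __name__ == "__main__":
    # Demo in a throwaway directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        root = scaffold(Path(tmp), "name")
        print(sorted(p.name for p in root.iterdir()))
```

Keeping the file contents in a separate function makes it easy to change the template text without touching the directory logic.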

I don't think it is necessary (or to be encouraged) to add comments to a DSC script like this, because the HTML file exported for the DSC script now contains that information (#209). That is the file reported in the first line of output when you run DSC:

$ ./finemap.dsc --debug
INFO: DSC script exported to finemap_output.html
...
@pcarbo
Member

pcarbo commented Nov 26, 2019

My suggestions:

  • We should call it dsc-start, dsc-init or dsc-create, since dsc-setup sounds like it could be for installing DSC, rather than creating a new DSC benchmark.

  • Use git, but don't assume GitHub (so don't create .gitattributes, or any other files specific to GitHub).

  • Make sure to commit the files in addition to creating them.

  • Don't use workflowr.

  • Don't use shiny.

  • I'm open to implementing this in R, but if we do that, it should be a function in the R package (e.g., dscstart or dscinit), not a script or command-line tool.

  • Keep the template super simple, but it should also do something. For example, it could implement the "one sample location" example, so dsc-setup mybenchmark creates

$ tree mybenchmark
mybenchmark
├── analysis
│   └── summarize_results.R
└── dsc
    ├── modules
    │   ├── abs_err.R
    │   ├── mean.R
    │   ├── median.R
    │   ├── normal.R
    │   ├── sq_err.R
    │   └── t.R
    └── mybenchmark.dsc
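
A rough sketch of how the layout above could be generated, together with the git steps suggested in this comment (use git, and commit the files after creating them). It is written in Python purely as an illustration; the suggestion above is an R function. The names template_paths, create_template and git_commands are hypothetical, and the files are created empty rather than filled with the "one sample location" example.

```python
from pathlib import Path
import tempfile

MODULE_FILES = ["abs_err.R", "mean.R", "median.R", "normal.R", "sq_err.R", "t.R"]


def template_paths(name):
    """Relative paths of every file in the proposed template."""
    paths = [Path("analysis") / "summarize_results.R"]
    paths += [Path("dsc") / "modules" / f for f in MODULE_FILES]
    paths.append(Path("dsc") / f"{name}.dsc")
    return paths


def create_template(parent, name):
    """Create empty placeholder files under <parent>/<name>."""
    root = Path(parent) / name
    for rel in template_paths(name):
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    return root


def git_commands(message="Initial commit of DSC template"):
    """Plain git commands to run inside the new folder; nothing GitHub-specific."""
    return [
        ["git", "init"],
        ["git", "add", "."],
        ["git", "commit", "-m", message],
    ]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = create_template(tmp, "mybenchmark")
        for rel in template_paths("mybenchmark"):
            print(rel.as_posix())
        # To actually commit, each command could be run with
        # subprocess.run(cmd, cwd=root, check=True); this needs git installed
        # and a configured user.name and user.email.
        for cmd in git_commands():
            print(" ".join(cmd))
```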

@gaow
Member Author

gaow commented Nov 27, 2019

@pcarbo we can set up a GitHub repository to hold the aforementioned template that actually does something. I can put in a version if you create such a repo under the stephenslab GitHub account.

@gaow
Member Author

gaow commented Nov 27, 2019

Or maybe we can do it inside the dscrutils package, e.g., in the inst folder? See it here. From my experience using DSC:

  1. I don't think it is a bad idea to add a .gitattributes file.
  2. I recommend using %include so one can have multiple "main" DSC files, like template.dsc, for various specific benchmarks.
  3. .gitignore should at least include our default output directory, to prevent novice users from adding the output to GitHub.
  4. I recommend running chmod +x on the main DSC file so users can execute it directly as ./template.dsc
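
Points 3 and 4 could look roughly like this. It is a sketch in Python for illustration only, and the helper names write_gitignore and make_executable are hypothetical. The name of the default output directory is not stated in this thread, so it is passed in as a parameter, and the "template" value in the demo is an assumption.

```python
import os
import stat
import tempfile
from pathlib import Path


def write_gitignore(root, output_dir):
    """Write a .gitignore that excludes the DSC output directory."""
    path = Path(root) / ".gitignore"
    path.write_text(f"{output_dir}/\n")
    return path


def make_executable(path):
    """Add execute permission for user, group and others (like chmod a+x)."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        main_dsc = Path(tmp) / "template.dsc"
        main_dsc.write_text("#!/usr/bin/env dsc\n")
        make_executable(main_dsc)
        # Assumption: the output directory is called "template".
        ignore = write_gitignore(tmp, "template")
        print(ignore.read_text(), end="")
        print(oct(stat.S_IMODE(os.stat(main_dsc).st_mode)))
```

With the execute bit set and the shebang line in place, the file can be run directly as ./template.dsc, provided dsc is on the PATH.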

gaow added a commit that referenced this issue Nov 27, 2019
@pcarbo
Member

pcarbo commented Nov 27, 2019

Or maybe we can do it inside the dscrutils package, e.g., in the inst folder? See it here.

That's a great start, thanks.

  1. I don't think it is a bad idea to add a .gitattributes file.

I'm more okay with it here; it was annoying when it was being generated every time I ran dsc.

I recommend using %include so one can have multiple "main" DSC files like template.dsc for various specific benchmarks.

I would say this is going against the principle of this being a simple DSC. And for that matter, I think having the DSC specified in a single file is one of the things that makes DSC attractive.

I'm fine with 3 and 4.
