Submit the SoS job submitter to a compute node #5
Comments
A possible interface is to change https://github.com/cumc/dsc/blob/master/vignettes/one_sample_location/midway.yml, adding to this section

    default:
      queue: midway2
      instances_per_job: 40
      nodes_per_job: 1
      instances_per_node: 4
      cpus_per_instance: 1
      mem_per_instance: 2G
      time_per_instance: 3m

these extra lines:

      submitter_mem: 6G
      submitter_walltime: 36h

then with […] to execute the DSC by submitting SoS jobs on a compute node. As you can see, the two […]. This solution can be implemented in DSC code without requesting a new feature from SoS. And we can allow something like

      submitter_mem: None
      submitter_walltime: None

to say that we want to submit from where we execute the command, rather than submitting it to a node which will then submit all jobs. This is the current behavior anyway.
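As a rough illustration of the proposal above, the following Python sketch shows one way DSC code might interpret the two keys. The function names `to_seconds` and `submitter_request` are hypothetical and not part of DSC or SoS, and the duration format (an integer followed by `s`, `m`, `h`, or `d`) is an assumption based on the `3m` and `36h` values shown in the configuration.

```python
import re
from typing import Optional

# Seconds per unit for durations such as "3m" or "36h" (assumed format).
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def to_seconds(value: str) -> int:
    """Convert a duration such as '36h' into seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", value.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def submitter_request(section: dict) -> Optional[dict]:
    """Return the resources to request for the submitter, or None to
    keep the current behavior (submit from where the command runs)."""
    mem = section.get("submitter_mem")
    walltime = section.get("submitter_walltime")
    # A bare `None` in YAML is parsed as the string "None", not as null,
    # so both spellings are treated as "not set".
    unset = (None, "None")
    if mem in unset and walltime in unset:
        return None
    if mem in unset or walltime in unset:
        raise ValueError("set both submitter_mem and submitter_walltime, or neither")
    return {"mem": mem, "walltime_seconds": to_seconds(walltime)}
```

With the values from the example, `submitter_request({"submitter_mem": "6G", "submitter_walltime": "36h"})` asks for 6G and 129600 seconds, while a section without these keys keeps local submission.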
It is never a good idea to have long-running processes on the head node, even just for job submission with controlled RAM and CPU usage. vatlab/sos#1407 now works (check the last few posts for a sample configuration); let me know if it works for your cluster. Note that […] etc. can be used to check the status of workflows (with IDs starting with […]).
Currently, DSC runs in two modes:
1. dsc ...
2. dsc ... --host
where the --host option loads a host configuration file such that an SoS job submitter keeps running in the background (on a cluster's login node, for example) and computation jobs are submitted to the cluster's compute nodes.
The problem with 2, obviously, is that running the SoS job submitter in the background can be a bit resource intensive and is not welcome on the cluster login node. So to run a DSC job on the cluster, one has to do something like this:
This is a bit tedious. We'd like an interface and mechanism for submitting such a job to a compute node, which then submits jobs to the cluster. Perhaps it should be done on the SoS end? I'm going to submit a ticket to the SoS repo.
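For illustration only, here is a minimal Python sketch of the kind of wrapper being asked for: it builds a batch script that runs the submitter command on a compute node. It assumes a SLURM scheduler and that compute nodes are allowed to submit jobs, neither of which is stated in this issue. The helper name `slurm_wrapper` and the file names in the usage note are hypothetical.

```python
import shlex
from typing import List


def slurm_wrapper(command: List[str], mem: str, hours: int, partition: str) -> str:
    """Build a SLURM batch script that runs `command` as a single task."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --partition={partition}",
        f"#SBATCH --mem={mem}",
        f"#SBATCH --time={hours}:00:00",  # SLURM hours:minutes:seconds format
        "#SBATCH --ntasks=1",
        shlex.join(command),  # quote arguments safely for the shell
    ]
    return "\n".join(lines) + "\n"
```

A call such as `slurm_wrapper(["dsc", "example.dsc", "--host", "midway.yml"], "6G", 36, "midway2")` returns script text that could be saved to a file and passed to `sbatch`. Whether this is better handled in DSC or in SoS is the open question of this issue.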