Proposal: user-defined functions #405

Open
jdidion opened this issue Oct 15, 2020 · 2 comments

jdidion commented Oct 15, 2020

Cross-posted from #405

Also see discussions here:

Currently, the WDL specification provides a small library of functions that meet the needs of many use-cases, but certainly not all of them. The ability to define new functions has been requested several times in the past. This proposal aims for an idiomatic specification of UDFs.

Example:

task call_funcs {
  input {
    File infile
  }

  command <<<
  process_yaml < ~{read_yaml(infile)} > output.txt
  >>>

  output {
    File outfile = "output.txt"
  }
}

func read_yaml {
  input {
    File infile
    String? encoding
  }

  command <<<
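  import yaml
  # Fall back to UTF-8 when the optional encoding input is not supplied.
  with open("~{infile}", "r", encoding="~{select_first([encoding, 'utf-8'])}") as inp:
    y = yaml.safe_load(inp)
  # Write the parsed document to stdout so the output section can capture it.
  print(yaml.safe_dump(y))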
  >>>

  output {
    String content = read_string(stdout())
  }

  runtime {
    container: "python_with_yaml"
    interpreter: "python"
  }
}

The signature of the above function is: String read_yaml(File, String?)

User-defined functions are similar to tasks, with the following differences:

  • User-defined functions begin with the func keyword.
  • The order of the input parameters matters - the (left-to-right) function signature is the set of input parameters ordered from top-to-bottom.
  • There may be at most one optional input parameter (which may or may not have a default value), and it must be the last parameter in the signature.
  • Only a single output parameter is allowed.
  • Runtime attributes and Hints: TBD - if functions are executed in the same process as the calling task, runtime/hints must somehow be merged with the task's runtime/hints.
  • There is one function-specific runtime attribute:
    • sections: the section(s) in which the function may be used; defaults to "*" (all sections). The value may be a String or an Array[String] naming one or more of the four task sections that allow expressions (input, output, command, runtime).
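
To make the signature rules above concrete, here is a minimal Python sketch of how an implementation might derive and validate a func's signature. Nothing here is part of the proposal itself: `Param`, `func_signature`, and `normalize_sections` are hypothetical names, and the sketch assumes an input counts as optional only when its type ends in `?`.

```python
from dataclasses import dataclass
from typing import List, Sequence, Union

# The four task sections that allow expressions, per the proposal text.
ALLOWED_SECTIONS = ("input", "output", "command", "runtime")


@dataclass
class Param:
    """One declaration from a func's input section (hypothetical model)."""
    wdl_type: str  # e.g. "File" or "String?"
    name: str

    @property
    def optional(self) -> bool:
        # Assumption: optionality is marked only by a trailing "?".
        return self.wdl_type.endswith("?")


def func_signature(name: str, inputs: List[Param], output_type: str) -> str:
    """Build the left-to-right signature from inputs listed top-to-bottom."""
    optional_positions = [i for i, p in enumerate(inputs) if p.optional]
    if len(optional_positions) > 1:
        raise ValueError(f"{name}: at most one optional input is allowed")
    if optional_positions and optional_positions[0] != len(inputs) - 1:
        raise ValueError(f"{name}: the optional input must come last")
    args = ", ".join(p.wdl_type for p in inputs)
    return f"{output_type} {name}({args})"


def normalize_sections(value: Union[str, Sequence[str]] = "*") -> List[str]:
    """Expand the `sections` attribute into an explicit list of section names."""
    if value == "*":
        return list(ALLOWED_SECTIONS)
    names = [value] if isinstance(value, str) else list(value)
    unknown = [n for n in names if n not in ALLOWED_SECTIONS]
    if unknown:
        raise ValueError(f"unknown section(s): {unknown}")
    return names


signature = func_signature(
    "read_yaml",
    [Param("File", "infile"), Param("String?", "encoding")],
    "String",
)
print(signature)
```

For the `read_yaml` example this reproduces the signature quoted earlier. Treating `"*"` as "all four sections" is my reading of the default, not something the proposal spells out.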

Similar to structs, funcs exist in a common namespace, regardless of which WDL file defines them. Unlike structs, funcs cannot be aliased, so there must not be any name collisions between funcs defined in different WDL files in the import tree.
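
Since funcs share one namespace and cannot be aliased, an implementation would presumably need to reject collisions when it resolves the import tree. A rough Python sketch of that check follows; `find_func_collisions` and the file names are made up for illustration, and the input is assumed to be a mapping from each WDL file to the func names it defines.

```python
from collections import defaultdict
from typing import Dict, List


def find_func_collisions(funcs_by_file: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return each func name defined more than once, with the files defining it."""
    defined_in = defaultdict(list)
    for path, names in funcs_by_file.items():
        for name in names:
            defined_in[name].append(path)
    # Keep only names with more than one definition across the import tree.
    return {
        name: sorted(paths)
        for name, paths in defined_in.items()
        if len(paths) > 1
    }


# Hypothetical import tree: two files both define read_yaml.
collisions = find_func_collisions({
    "main.wdl": ["read_yaml"],
    "lib/formats.wdl": ["read_yaml", "read_toml"],
})
print(collisions)
```

A non-empty result would be a validation error under this proposal, because there is no alias mechanism to resolve it.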

Once defined, a func may be used by its (unqualified) name in any command block.

In conjunction with the proposed addition of the interpreter runtime attribute, users will be able to write functions in a variety of programming languages. This raises the question of how to support functions written in different languages, or a function written in a different language than the command block. There are a few possible solutions:

  • Require that the task container (or host) environment provides all of the interpreters required by all of the functions used in the command block.
  • Use a solution such as docker compose or docker run --link to enable the commands to access executables across containers. This means that each function would need to specify its container, and the runtime would be required to dynamically compose the container of the task and all functions used by that task.
  • Execute functions in their own environments, e.g. subprocesses or separate workers. This makes executing a task similar to executing a workflow.
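
As a rough illustration of the first option, an engine could scan the command block for func calls and check that the task environment offers every interpreter those funcs need. The Python sketch below is only one way to do that: `interpreters_needed` and `missing_interpreters` are hypothetical helpers, the func-to-interpreter mapping is assumed to come from each func's runtime section, and the regex is a simplification that does not parse WDL expressions properly.

```python
import re
from typing import Dict, Set


def interpreters_needed(command: str, func_interpreters: Dict[str, str]) -> Set[str]:
    """Collect interpreters of funcs called inside ~{...} placeholders."""
    needed = set()
    for name, interpreter in func_interpreters.items():
        # Match "~{ ... name(" within a single placeholder (simplified).
        pattern = r"~\{[^}]*\b" + re.escape(name) + r"\s*\("
        if re.search(pattern, command):
            needed.add(interpreter)
    return needed


def missing_interpreters(
    command: str,
    func_interpreters: Dict[str, str],
    available: Set[str],
) -> Set[str]:
    """Interpreters required by the command's funcs but absent from the task environment."""
    return interpreters_needed(command, func_interpreters) - available


command = "process_yaml < ~{read_yaml(infile)} > output.txt"
funcs = {"read_yaml": "python", "read_toml": "ruby"}
print(missing_interpreters(command, funcs, available={"bash"}))
```

With the `call_funcs` task from the example, only `python` is reported, since `read_toml` is never called. The other two options would replace this check with container composition or separate execution.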

mlin commented Oct 19, 2020

@jdidion what do you think of using the Discussions forum for this and your other proposal? (I just added mine there for good measure :)

jdidion commented Oct 19, 2020

Sure - added both proposals there. We'll see where they get more traction.
