Feature Request: Inversion of Control features should be supported in notebooks #1088
Comments
Hi @marr75, Thanks for your feedback! Some thoughts.
The quick way to fix this is to remove the injected cell before committing to git. The reason why we're injecting absolute paths is to reduce ambiguity because when you execute However, I agree this creates the problem where paths resolve to different absolute values in different environments. I think an alternative would be to inject them as paths relative to the Re serializers and clients: I like your
@edublancas I'm blown away by how responsive you are 😁 Honestly, digging deeper into the docs, you provide an example of exploring outputs of the rest of the pipeline using Related, many of the docs show examples where the task is responsible for serialization and "client" concerns, obtaining only paths from the upstream and product. I think this is ignoring the most powerful features of ploomber. I may be able to help here, too.
Yes! Shortly after posting my initial response, I thought that adding the context functionality is simple since we already have functions for finding and parsing the dag, so I'm glad you found it. Feel free to open a draft PR to give early feedback. Also, you're welcome to join our community in case you have quick questions while working on setting up a dev environment, testing, etc.
We're working on adopting ploomber as our pipeline management technology. In early experimentation, I've found that many of the best inversion of control features of ploomber don't seem to be supported for notebooks. I find this odd because of the amount of attention and ink spent on integrating jupyter notebooks.
Examples (in order of importance):
The extensions you've added to make a jupyter[lab] server work well with plain ol' .py files are very useful, but I was disappointed in the small subset of features available to notebook tasks. This breaks the most powerful features of ploomber when using notebooks. Pure Python tasks can use clients and serializers to improve testability and make large changes possible with tiny, reliable changes to the pipeline spec. With pure Python, you can develop using human-readable formats and the local filesystem, then switch to binary formats and cloud storage in production with a couple of lines of YAML; this is not possible with notebooks. Further, ploomber teaches certain concepts and expectations around upstreams and products when using Python tasks that are not valid when using notebooks.
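To make the serializer point concrete, here is a minimal plain-Python sketch of the idea. It does not use ploomber's API: the `FORMATS` table is a hypothetical stand-in for the pipeline spec, and `run_task`, `make_features`, and the serializer functions are illustrative names only. The task body returns data and never touches storage, so switching formats is a change to configuration rather than to the task.

```python
import json
import pickle
import tempfile
from pathlib import Path


def make_features():
    """Task body: returns data and knows nothing about storage."""
    return {"rows": 3, "columns": ["a", "b"]}


def json_serializer(obj, path):
    Path(path).write_text(json.dumps(obj))


def json_unserializer(path):
    return json.loads(Path(path).read_text())


def pickle_serializer(obj, path):
    Path(path).write_bytes(pickle.dumps(obj))


def pickle_unserializer(path):
    return pickle.loads(Path(path).read_bytes())


# Hypothetical stand-in for the pipeline spec: one entry per environment.
FORMATS = {
    "dev": (json_serializer, json_unserializer, "features.json"),
    "prod": (pickle_serializer, pickle_unserializer, "features.pkl"),
}


def run_task(task, env, out_dir):
    """Run the task, store its result with the configured serializer,
    and read it back the way a downstream task would."""
    serialize, unserialize, name = FORMATS[env]
    product = Path(out_dir) / name
    serialize(task(), product)
    return product, unserialize(product)


with tempfile.TemporaryDirectory() as tmp:
    dev_path, dev_value = run_task(make_features, "dev", tmp)
    prod_path, prod_value = run_task(make_features, "prod", tmp)
    dev_name, prod_name = dev_path.name, prod_path.name
```

The same `make_features` function runs unchanged in both environments; only the entry selected from `FORMATS` differs. A notebook task that receives bare path strings has to hard-code the equivalent of `json_serializer` itself.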
Suggestion: abstract upstream and product into python objects you import instead of injecting dictionaries of strings into notebooks.
could become:
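The general shape of this suggestion can be sketched in plain Python. Everything below is hypothetical: `Product`, `Upstream`, `write_json`, and `read_json` are illustrative names, not ploomber's API and not the snippet from the original post. The point is only that a notebook cell could call `save()` and `load()` on objects instead of reading path strings out of an injected dictionary.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Callable


@dataclass
class Product:
    """Hypothetical wrapper: knows where and how to store a result."""
    path: Path
    serializer: Callable[[Any, Path], None]

    def save(self, obj):
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.serializer(obj, self.path)


@dataclass
class Upstream:
    """Hypothetical wrapper: knows how to load an upstream result."""
    path: Path
    unserializer: Callable[[Path], Any]

    def load(self):
        return self.unserializer(self.path)


def write_json(obj, path):
    path.write_text(json.dumps(obj))


def read_json(path):
    return json.loads(path.read_text())


with tempfile.TemporaryDirectory() as tmp:
    # An upstream task stores its result through the Product object.
    raw = Product(Path(tmp) / "output" / "raw.json", write_json)
    raw.save([1, 2, 3])

    # The downstream task never sees a path string or a file format.
    upstream = Upstream(raw.path, read_json)
    doubled = [x * 2 for x in upstream.load()]
```

Because the serializer is attached to the object rather than written into the cell, swapping the format or the storage location would not require editing the notebook.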