Evaluate pex as a deployment option for streamparse #212
Comments
Completely agree on this one. Oh, and by the way,
@dan-blanchard nice! 💯
This 20-minute talk on YouTube (done 100% at the command line) gives a nice overview of pex: https://www.youtube.com/watch?v=NmpnGhRwsu0 I added info to my issue body based on this talk.
I haven't had a chance to watch the video yet, but I definitely lean more toward solution one, with the exception that I think there should be one .pex file per Python version so that people could easily have some components use Python 2 and others use 3 (or pypy).
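A rough sketch of what "one .pex file per Python version" could look like when choosing the command for a component. The `topology-<interpreter>.pex` naming scheme and both function names are made up for illustration; nothing in streamparse defines them.

```python
# Hypothetical helpers: pick a per-interpreter .pex file for a component.
# The "<base>-<interpreter>.pex" naming scheme is an assumption, not a
# streamparse or pex convention.

def pex_name_for(interpreter, base="topology"):
    """Map an interpreter label such as 'python2' or 'pypy' to a .pex file name."""
    return "{}-{}.pex".format(base, interpreter)


def component_command(class_name, interpreter):
    """Build the argv that would run one component under the chosen interpreter."""
    return [pex_name_for(interpreter), "-m", "streamparse.run", class_name]


if __name__ == "__main__":
    # "MySpout" and "MyBolt" are placeholder class names.
    print(component_command("MySpout", "python2"))
    print(component_command("MyBolt", "pypy"))
```

With something like this, each component would only need to declare which interpreter it wants, and the rest of the launch command stays the same.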
Just discovered that pex does not support editable requirements (
This comment has a workaround. Basically we'd need to add a step to our process where we cloned all the projects separately first, because pex is fine with just chucking package directories in there.
it's worth noting that e.g.:
happy to discuss further or help answer any questions you guys might have about pex.
Thanks for the info @kwlzn!
Closing in favor of #445.
Suggested by one of our users, it might be nice to package Python environments in pex files, which would include code and dependencies. This could be a mechanism of eliminating the need to ever use virtualenv or fabric. I don't know what the other implications are of using pex, but someone could certainly explore. I don't think it would be too hard to get it working even with a current version of streamparse, based on the rough description in the docs.
There seem to be two options.

1. Build a .pex file out of dependencies and include it in the topology JAR with a standard name like topology.pex. Rather than calling /virtualenv/topology-venv/bin/python -m streamparse.run <class_name> to run a component, we actually call topology.pex -m streamparse.run <class_name>. Since pex supports -m similar to a Python interpreter, this should "just work". This seems like the preferred option -- my only concern here is a "platform build mismatch" issue, e.g. if a dependency is a C extension module that needs to be built for the target platform rather than the development platform, building the .pex file locally may not produce the right thing (?). This might not matter as much if a bdist exists for that module and pex's --platform argument is used.

2. Use the pex CLI tool. That is, rather than bundling topology.pex inside the JAR, we actually bundle the dependency list in requirements.txt format. We then make the topology entrypoint pex -r requirements.txt -m streamparse.run. This ensures that the environment is built on the remote server upon topology startup; the main downside is that this command will probably take a long time to run the first time (before a pip cache kicks in?) and I'm not entirely sure how friendly Storm will be to that. I wonder if invoking it once upon topology submit via Fabric could be a trick to warm up the pip cache while also catching requirement specification errors at submit-time rather than topology run-time.

Whichever option we pick, it seems like it could offer some improvements to the virtualenv approach, but I haven't dug into pex and tested it out too much yet.
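To make the two options concrete, here is a minimal sketch that only assembles the command lines involved (it does not run anything). The function names are hypothetical, and the `-o` output flag in the build step is my assumption about how the .pex would be produced; the issue text itself only mentions `-r`, `-m` and `--platform`.

```python
# Sketch only: builds argv lists for the two options, does not execute them.
# Function names are illustrative and not part of streamparse.

STREAMPARSE_RUNNER = "streamparse.run"


def build_pex_command(requirements="requirements.txt", output="topology.pex",
                      platform=None):
    """Option 1, build step: make a .pex from a requirements file.

    Assumes pex's -o flag for the output file; --platform is only added when
    a target platform string is supplied by the caller.
    """
    cmd = ["pex", "-r", requirements, "-o", output]
    if platform:
        cmd += ["--platform", platform]
    return cmd


def run_bundled_command(class_name, pex_path="topology.pex"):
    """Option 1, run step: use the bundled .pex like a Python interpreter."""
    return [pex_path, "-m", STREAMPARSE_RUNNER, class_name]


def run_on_server_command(requirements="requirements.txt"):
    """Option 2: resolve dependencies on the server at topology startup.

    Mirrors the command in the issue text, which does not show how the
    component class name would be passed, so it is left out here too.
    """
    return ["pex", "-r", requirements, "-m", STREAMPARSE_RUNNER]


if __name__ == "__main__":
    print(" ".join(build_pex_command()))
    print(" ".join(run_bundled_command("<class_name>")))
    print(" ".join(run_on_server_command()))
```

The main difference shows up in where the slow step happens: option 1 pays the dependency resolution cost once at build time on the development machine, while option 2 pays it on the remote server the first time the topology starts.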