Allow user-specified HTTP_PROXYs for opportunistic resources #600
Comments
This feels to me like something that WQ (or probably better yet, VC3) should be taking care of. The proxy used by the worker should be a function of where the worker runs. Specifying it in Lobster means that you can't have a proxy that is specific to the network each worker is on. I guess, as a fallback, it could be specified in Lobster, but I'll bet you that most users don't have access to a proxy server to which they can direct their traffic. Could this be made part of the WQ or VC3 environment?
I guess we could have the VC3 glidein set |
In terms of VC3, I think the real question is how does the user of VC3 specify that a proxy is needed, and if so, how does VC3 ensure the site is providing one? |
There is a resource provider specification that somebody (system admin, etc.) has to fill in at some point with head node information, resource management (condor, slurm, pbs, ...), and so on, so the proxy (if available) could be another entry in this specification. EDIT: Looking at the vc3 client, the user can currently specify special environment variables, like the HTTP proxy, per target. The user would then need to know how this is used in their application. For example, I could specify in my vc3 request that I want to set GLIDEIN_Proxy_URL = myproxy.uchicago.edu (or HTTP_PROXY?) for my UChicago target, knowing that lobster will use it for CVMFS / parrot. CMS sites just advertise that info in the cvmfs SITECONF, and applications like CRAB know how to look for it.
Have you tried setting If setting that works for you, we should add it to the documentation somewhere. |
@matz-e : Yeah, I tried that first, but I was still getting eddie, which is why I ended up using Changing the above would be easy, but I thought having an advanced parameter to avoid the user having to export environment variables prior to running their factories would be better. |
You're right, @khurtado, I forgot about that. Yes, an advanced configuration parameter would probably be best, since it's easy to forget to export custom settings. In addition, we could add yet another environment variable, i.e., |
I like that, it would cover both situations. @klannon, opinions? |
Sorry to leave this sit for so long. I'm confused, why wouldn't we just change the master behavior not to overwrite the existing |
Overwritten is not the right word here… those variables are used to tell the worker what the master uses. This works via WQ, where the worker sets these environment variables to the master values before the wrapper is executed. Hence my suggestion to add a few dedicated variables to set the proxy for this case. I would suggest setting the proxy to |
I guess I'm arguing for a bigger shift. Basically, take the responsibility for setting these values completely away from the master. In default Lobster usage (i.e., non-VC3), couldn't we just as easily include these variables as part of the factory config? (Maybe @btovar could weigh in?) In VC3 usage, providing a proxy server and communicating that to the task is part of specifying the necessary resources, I think, not something the task should be doing for itself. So, basically, what I'm arguing for is to have Lobster not set those values at all in the master, but instead make it part of what you need to do to set up the worker. Do you think that would work?
I introduced those settings in #298, to remove having our T3 stuff hardcoded. In principle, these settings are worker-specific and should not be set on the master. For user convenience, particularly for running at Notre Dame, the code as is makes sense: minimal user effort to start an instance of Lobster that just works. If we can provide these values as factory configuration values, I'm OK with removing/reverting the settings. After all, that will shrink the code base and make the master more robust.
I don't see any options in the factory or worker to specify environment variables. @btovar, if we could have a factory setting |
Currently, lobster tries to detect an HTTP_PROXY on the Worker, and it also tries to detect a proxy on the Master machine as a fallback. This works fine on:
eddie.crc.nd.edu:3128 detected from the master machine and all WNs have access to it.
But it breaks if:
A workaround for this is using export HTTP_PROXY=something before starting your work_queue_factory, because the factory exports the submit environment to the worker nodes. However, this breaks work_queue, since the WQ catalog will try to connect through this proxy, and that's not guaranteed to work.
My current workaround is exporting GLIDEIN_Proxy_URL before running the factory instead. This makes work queue connect to the catalog without proxies, while parrot still uses the proxy for CVMFS.
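For anyone who wants to script that workaround, here is a rough sketch of building the factory's environment in Python. The helper name `factory_environment` and the example proxy URL are made up for illustration, and stripping HTTP_PROXY from the environment is my assumption about how to keep the catalog connection unproxied; only the GLIDEIN_Proxy_URL variable itself comes from the description above.

```python
import os


def factory_environment(proxy_url, base_env=None):
    """Return a copy of the environment suitable for launching the factory.

    Hypothetical helper: sets GLIDEIN_Proxy_URL and removes the generic
    HTTP proxy variables so they are not exported to the workers.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: dropping these keeps the WQ catalog connection unproxied.
    for name in ("HTTP_PROXY", "http_proxy"):
        env.pop(name, None)
    # Variable named in the workaround above; the value is a placeholder.
    env["GLIDEIN_Proxy_URL"] = proxy_url
    return env


if __name__ == "__main__":
    env = factory_environment("http://proxy.example.org:3128")
    # Pass `env` to subprocess.run(<your usual factory command>, env=env).
    print(env["GLIDEIN_Proxy_URL"])
```

The returned dictionary is a copy, so the shell that starts the factory keeps its own environment untouched.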
We should probably let the user specify the fallback proxy as an advanced parameter, so that lobster only tries to detect a proxy on the master machine if this advanced parameter is unset. That proxy will only be used in cases where the wrapper on the WN can't detect a valid proxy. Does that make sense? Is there a better approach to solve this?
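To make the proposed precedence concrete, here is a minimal sketch of the selection order. It is not Lobster's actual code: the function names, the `user_proxy` parameter, and the idea that detection simply reads HTTP_PROXY from the environment are all assumptions made for illustration.

```python
import os


def _detect(environ):
    """Assumed detection: read HTTP_PROXY (or http_proxy) from the environment."""
    return environ.get("HTTP_PROXY") or environ.get("http_proxy")


def master_fallback_proxy(user_proxy=None, environ=None):
    """Master side: choose the fallback proxy passed along to the workers.

    `user_proxy` stands in for the proposed advanced parameter. Detection
    on the master only happens when it is unset.
    """
    environ = os.environ if environ is None else environ
    if user_proxy:
        return user_proxy
    return _detect(environ)


def worker_proxy(fallback, environ=None, is_valid=bool):
    """Worker side: prefer a valid locally detected proxy, else the fallback.

    `is_valid` is a hypothetical hook for whatever check the wrapper applies.
    """
    environ = os.environ if environ is None else environ
    detected = _detect(environ)
    if detected and is_valid(detected):
        return detected
    return fallback
```

With this ordering, a proxy found on the worker always wins, the user-specified value comes next, and the master's own proxy is only consulted when the user has not set anything.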