Telegraf reloads URL-based/remote config on a specified interval #8730
Comments
Previous issue on this same topic: #5502. Keeping this issue open since it addresses the discussion with @schmorgs. From @danielnelson:
|
Continuing the conversation from #8529 (comment) here:
Based on this comment: the implementation on the Telegraf side may not be too complicated, but there could be issues on the remote config server side (e.g. Telegraf config stored in InfluxDB Cloud). |
We'll look to implement this after a Config API is implemented in Telegraf, since it should cover this functionality. We want to think of this feature not as working around the limits of reloading configuration through a SIGHUP, but as changing the user experience: you update your configuration, and the Telegraf agent detects and applies those changes. |
+1. I'm using a scheduled task in Windows now to restart the Telegraf service, but that's quite an ugly workaround. Being able to trigger a reload server-side, or having Telegraf poll for new config and reload on a change, would be a great solution. |
Agree with @DavidBoman. A built-in process to retry a config load on a predetermined schedule would help simplify config deployment in a large environment. The caveat is that if the config endpoint is down, the agent should not crash and should carry on with the cached config. The strategic agent API config load is good, but in an enterprise environment it adds complexity in places such as a DMZ, where the endpoint may not be immediately contactable. |
The scenarios above, and several others, need to be figured out before landing this feature. For example: what happens if the file is unreachable or the configuration itself contains errors, how the user gets feedback on whether an update occurred, and what security mechanism applies. The next steps are to design and flesh out these and other cases. Then we can work to add the CLI support for this feature. |
What about adding some more metrics to inputs.internal? I'm not sure what can be provided, since that might depend on the implementation, but something like |
+1 |
Is there any plan to add this feature @powersj ? |
@paulojmdias - I started working on a spec for this: #15321. Please take a look and comment. |
This introduces a new config-url-watch-interval option, which when set will, at each interval, check the Last-Modified header of the file to determine if telegraf should reload. If the header is not available then the watcher is disabled for the file. fixes: influxdata#8730
Feature Request
Proposal:
Reload configuration via URL
Current behavior:
The only time the Telegraf config is (re)loaded is at agent startup. The agent currently does not poll for new configuration when using URL-based config.
Desired behavior:
Reload configuration based on a designated frequency for the reload and include a jitter time to ensure we avoid a thundering herd for the reading of the config from the server hosting it.
Use case:
As it stands, the agent requests its configuration at startup based on the supplied URL, but users can modify the configuration at any point afterwards. Such a configuration change requires a signaling mechanism to trigger Telegraf to reload the config -- which doesn't currently exist. A simple time-based mechanism can ensure that all active agents will refresh within some reasonable time window.
One workaround on Linux is to use the Exec plugin to send a “HUP” to the agent every hour, which triggers Telegraf to reload. But there is no equivalent of HUP on Windows, so the agent has to be killed, and while waiting for the Windows Service Control Manager to restart it there is a short blackout period (as well as an error in the Windows Event Log). Neither is ideal.