Telegraf reloads URL-based/remote config on a specified interval #8730

sjwang90 · 2021-01-21T21:55:56Z

Feature Request

Proposal:

Reload configuration via URL

Current behavior:

The only time the Telegraf config is (re)loaded is at agent startup. The agent currently does not poll a URL-based config for changes.

Desired behavior:

Reload the configuration at a designated interval, and add jitter to each reload so that agents do not create a thundering herd when reading the config from the server hosting it.

Use case:

As it stands, the agent requests its configuration at startup based on the supplied URL, but users can modify the configuration at any point afterwards. Such a configuration change requires a signaling mechanism to trigger Telegraf to reload the config -- which doesn't currently exist. A simple time-based mechanism can ensure that all active agents will refresh within some reasonable time window.

One workaround on Linux is to use the Exec plugin to send a HUP signal to the agent every hour, which triggers Telegraf to reload. But there is no equivalent of HUP on Windows, so the agent has to be killed instead. While waiting for the Windows Service Control Manager to restart it, there is a short blackout period (as well as an error in the Windows Event Log). Neither is ideal.


sjwang90 · 2021-01-27T21:06:32Z

Previous issue on this same topic: #5502. Keeping this issue open since it addresses discussion with @schmorgs

From @danielnelson

Thanks for the feedback, we have some plans to allow event triggered updates based on this prototype I worked on some time back: https://github.com/danielnelson/tgconfig. For now though I don't want to add any new cli options since I don't want to commit to keeping support for them.

sjwang90 · 2021-08-16T18:36:25Z

Continuing the conversation from #8529 (comment) here:

As there is no guarantee that the server holding the remote config supports HEAD (e.g. the InfluxDB server doesn't), and also no guarantee that the server will return information about the resource state (e.g. an ETag or Last-Modified header), the easiest solution could be to store a hash (MD5) of the current config and periodically (at a configurable interval) download a copy of the remote config, compute its hash, and compare. If it changed, restart.

The question is, what should the default check interval be so as not to burden the server? 1 minute?

Based on this comment, the implementation on the Telegraf side may not be too complicated, but there could be issues on the remote config server side (e.g. a Telegraf config stored in InfluxDB Cloud).
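The hash-and-compare approach from the quoted comment could look roughly like the sketch below. This is an illustration only, not code from Telegraf: the `configWatcher` type is hypothetical, the download step is left out (the caller passes in the fetched bytes), and SHA-256 stands in for the MD5 mentioned in the comment, since either digest works for change detection.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// configWatcher (hypothetical) remembers the digest of the last config seen.
type configWatcher struct {
	last [sha256.Size]byte
	seen bool
}

// changed reports whether body differs from the previously seen config
// and records its digest. The first call only establishes a baseline.
func (w *configWatcher) changed(body []byte) bool {
	sum := sha256.Sum256(body)
	if !w.seen {
		w.last, w.seen = sum, true
		return false
	}
	if sum == w.last {
		return false
	}
	w.last = sum
	return true
}

func main() {
	w := &configWatcher{}
	v1 := []byte("[agent]\n  interval = \"10s\"\n")
	v2 := []byte("[agent]\n  interval = \"30s\"\n")

	fmt.Println(w.changed(v1)) // false: baseline
	fmt.Println(w.changed(v1)) // false: identical content
	fmt.Println(w.changed(v2)) // true: content differs, reload
}
```

This needs nothing from the server beyond a plain GET, at the cost of downloading the full config on every check.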

sjwang90 · 2021-08-16T18:42:00Z

We'll look to implement this after a Config API is implemented with Telegraf since it should cover this functionality.

We want to frame this feature not as a workaround for users who cannot reload their configuration through a SIGHUP, but as a change in user experience: you update your configuration, and the Telegraf agent detects and applies the change.

DavidBoman · 2021-09-01T19:12:03Z

+1

I'm using a scheduled task in Windows now to restart the Telegraf service, but that's quite an ugly workaround. Being able to trigger a reload server side, or having Telegraf poll for a new config and reload on a change, would be a great solution.

clever-trevor · 2021-09-01T19:46:48Z

Agree with @DavidBoman

An inbuilt process to retry a config load on a predetermined schedule would help simplify config deployment in a large environment

Caveat being that if the config endpoint is down, the agent does not crash and carries on with cached config

The strategic agent API config load is good, but in an enterprise environment it adds complexity in places such as a DMZ, where the endpoint may not be immediately contactable.


powersj · 2021-11-18T15:26:30Z

Caveat being that if the config endpoint is down, the agent does not crash and carries on with cached config

Figuring out the above and several other scenarios is essential before landing this feature. For example: what happens if the file is unreachable, what happens if there are errors in the configuration itself, how the user gets feedback on whether an update occurred, and what mechanism of security is needed.

The next steps are to design and flesh out the above and other cases. Then we can work to add the CLI support for this feature.

Trovalo · 2021-11-24T15:03:15Z

a way for the user to get feedback on whether an update occurred or some mechanism of security.

What about adding some more metrics to inputs.internal?
This would allow users to monitor and set up whatever alerts they want.

I'm not sure what can be provided, since that might depend on the implementation, but something like:
"LastConfigUpdate" → a date that will be saved as a string in InfluxDB (since it has no "date" datatype); I'm not sure how handy a string is for building alert rules in Grafana/Kapacitor
"IsUpdated" → a boolean that shows whether the config endpoint was reachable
"Message" → a text field with the error itself, whatever it is (timeout, not reachable, unauthorized)

pdrivom · 2024-02-17T00:21:17Z

+1

paulojmdias · 2024-05-07T22:01:35Z

Is there any plan to add this feature @powersj ?

powersj · 2024-05-07T23:21:03Z

@paulojmdias - started working on a spec for this: #15321 take a look and please comment.

This introduces a new config-url-watch-interval option, which when set will, at each interval, check the Last-Modified header of the file to determine if telegraf should reload. If the header is not available then the watcher is disabled for the file. fixes: influxdata#8730

sjwang90 added the feature request, area/configuration, and area/agent labels Jan 21, 2021

sjwang90 changed the title ~~Reload config based on a certain amount of time~~ Reload URL-based config on a specified time Jan 27, 2021

sjwang90 mentioned this issue Jan 27, 2021

Telegraf should check by itself if remote config has changed to reload the agent #5502

Closed

helenosheaa added the platform/windows label Jan 29, 2021

reimda changed the title ~~Reload URL-based config on a specified time~~ Reload URL-based config on a specified interval Feb 25, 2021

sjwang90 mentioned this issue Jul 30, 2021

feat: Added possibility to detect local config file changes and reload #8529

Closed

3 tasks

sjwang90 changed the title ~~Reload URL-based config on a specified interval~~ Telegraf reloads URL-based/remote config on a specified interval Aug 16, 2021

sjwang90 mentioned this issue Oct 28, 2021

Telegraf --watch-config need to support poll from http config #9992

Closed

toni-moreno pushed a commit to toni-moreno/telegraf that referenced this issue Nov 12, 2021

adds periodical http config reloads and change detection, closes influxdata#8730

866fad3

toni-moreno pushed a commit to toni-moreno/telegraf that referenced this issue Nov 12, 2021

adds periodical http config reloads and change detection, closes influxdata#8730

de96b48

toni-moreno pushed a commit to toni-moreno/telegraf that referenced this issue Nov 12, 2021

adds periodical http config reloads and change detection, resolves influxdata#8730

b81734b

toni-moreno pushed a commit to toni-moreno/telegraf that referenced this issue Nov 12, 2021

feat: adds periodical http config reloads and change detection, resolves influxdata#8730

081a705

toni-moreno mentioned this issue Nov 14, 2021

feat: adds periodical http config reloads and change detection #10102

Closed

3 tasks

toni-moreno pushed a commit to toni-moreno/telegraf that referenced this issue Nov 14, 2021

feat: adds periodical http config reloads and change detection, resolves influxdata#8730

1485ae4

toni-moreno pushed a commit to toni-moreno/telegraf that referenced this issue Nov 16, 2021

feat: add periodical http based config reload, resolves influxdata#8730

8eaa655

toni-moreno pushed a commit to toni-moreno/telegraf that referenced this issue Nov 17, 2021

feat: add periodical http based config reload, resolves influxdata#8730

3513c34

sjwang90 mentioned this issue Nov 29, 2021

Internal plugin additional fields #8607

Open

powersj mentioned this issue Nov 9, 2022

Git repository as config directory #12209

Closed

powersj mentioned this issue Feb 15, 2024

Cache remote config localy and use it if remote server not working #5501

Closed

powersj self-assigned this Apr 29, 2024

powersj mentioned this issue May 21, 2024

feat(config): Allow reloading on URL config change #15388

Merged

1 task

DStrand1 closed this as completed in #15388 Jun 3, 2024


sjwang90 commented Jan 21, 2021 • edited by timhallinflux
